Methods of Operating Quantum Computing Systems for Amplitude Estimation

ABSTRACT

This disclosure relates to enhanced methods of operating quantum computing systems to perform amplitude estimation. More than that, the methods may be tuned to accommodate for specific noise levels (e.g., in given a quantum device). Embodiments also enable quantum computing systems to perform amplitude estimation faster than amplitude estimation algorithms performed using a classical (non-quantum) computer.

CROSS-REFERENCE TO RELATED APPLICATION

This application claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Patent Application Ser. No. 63/116,712, “Algorithms for Quantum Amplitude Estimation With Noise,” filed Nov. 20, 2020, which is incorporated herein by reference in its entirety.

BACKGROUND 1. Technical Field

This disclosure relates generally to methods of operating quantum computing systems, and more particularly, to operating noisy and low depth quantum computing systems to perform amplitude estimation.

2. Description of Related Art

Quantum Amplitude estimation algorithms aim to estimate the amplitude of a unitary operator. These algorithms typically include the execution of large quantum circuits. However, current quantum computers are noisy, limited in qubits, and limited in circuit depth (referred to as noisy intermediate scale quantum (NISQ) devices). Thus, NISQ devices may not be capable of performing amplitude estimation.

SUMMARY

Embodiments described herein relate to enhanced methods of operating NISQ devices (and other quantum computing systems) to perform amplitude estimation. More than that, the methods may be tuned to accommodate for specific noise levels (e.g., in given a NISQ device). Embodiments also enable NISQ devices to perform amplitude estimation faster than amplitude estimation algorithms performed using a classical (non-quantum) computer.

Some embodiments relate to two new low depth methods for performing amplitude estimation (AE). These methods achieve a tradeoff between the quantum speedup and quantum circuit depth. For example, for β∈(0,1], the algorithms use

$N = {O\left( \frac{1}{\epsilon^{1 + \beta}} \right)}$

oracle calls and sequentially execute the oracle

$D = {O\left( \frac{1}{\epsilon^{1 - \beta}} \right)}$

times to perform amplitude estimation within additive error ϵ. These methods interpolate between a classical algorithm (β=1) and the standard quantum algorithm (β=0) and achieve a tradeoff ND=O(1/ϵ²). These methods bring quantum speedups for Monte Carlo methods closer to realization, as they can provide speedups with shallower circuits.

The first method (Power law AE) uses power law schedules. The methods works for β∈(0,1] and some embodiments have provable correctness when the log-likelihood function satisfies regularity conditions specified by the Bernstein Von-Mises theorem.

The second methods (QoPrime AE) uses the Chinese remainder theorem for combining lower depth estimates to achieve higher accuracy. The method works for discrete β=q/k, where k≥2 is the number of distinct coprime moduli used by the method and 1≤q≤k−1. Some embodiments have a fully rigorous correctness proof.

Additionally, this disclosure analyzes both methods in the presence of depolarizing noise and provides numerical comparisons with other amplitude estimation algorithms.

Other aspects include components, devices, systems, improvements, methods, processes, applications, non-transitory computer readable mediums, and other technologies related to any of the above.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the disclosure have other advantages and features which will be more readily apparent from the following detailed description and the appended claims, when taken in conjunction with the examples in the accompanying drawings, in which:

FIG. 1 plots a log-likelihood function for a power law amplitude estimation algorithm.

FIG. 2 illustrates a plot of the convergence of a coprime finding routine.

FIG. 3 illustrates a plot of the behavior of the oracle call dependency in equation 21 for the Qoprime algorithm.

FIG. 4 illustrates a plot of the transition from coherent to noise dominated as measured by the instantaneous exponent

$\frac{d\;\log\;\mathcal{N}}{d\;\log\;\epsilon},$

where N is the number oracle calls optimized over parameters k and q.

FIG. 5 illustrates a plot of the trajectory of optimal estimates for k, q parameters.

FIG. 6 illustrates a plot of the performance of the Power Law algorithm in theory (solid lines) and practice (dots and triangles) using the schedule

$m_{k} = \left\lfloor k^{\frac{1 - \beta}{2\beta}} \right\rfloor$

for values β=0.455 and β=0.714, for different error rates ϵ.

FIG. 7 illustrates a plot of the performance of the Power Law algorithm in theory (solid lines) and practice (dots, triangles, and squares) using power law schedules where the parameter β is improved (e.g., optimized) for given ϵ and γ.

FIG. 8 illustrates a plot of the performance of the QoPrime algorithm under several noise levels (γ).

FIG. 9 illustrates a plot of the maximum oracle depth as a function of the target precision ϵ.

FIG. 10 illustrates a plot of empirical values for the constant prefactor C, as described in equation (21).

FIG. 11 illustrates a plot the performances of the Power Law algorithm, the QoPrime algorithm, and the Iterative Quantum Amplitude Estimation (IQAE) algorithm for the noise level of γ=10⁻³.

FIG. 12 is a flow chart illustrating a first method of operating a quantum computer to determine θ of a unitary operator U within an error ϵ.

FIG. 13 is a flow chart illustrating a second method of operating a quantum computer to determine θ of a unitary operator U within an error ϵ.

FIG. 14A is a block diagram that illustrates an embodiment of a computing system.

FIGS. 14B-14C are block diagrams that illustrate embodiments of a quantum computing system.

FIG. 14D is a flow chart that illustrates an example execution of a quantum routine on the computing system.

FIG. 15 is an example architecture of a classical computing system.

DETAILED DESCRIPTION

The figures and the following description relate to preferred embodiments by way of illustration only. It should be noted that from the following discussion, alternative embodiments of the structures and methods disclosed herein will be readily recognized as viable alternatives that may be employed without departing from the principles of what is claimed.

1. Introduction

Amplitude estimation (AE) describes quantum algorithms that allow a quantum computing system to estimate the amplitude

0|U|0

for a quantum circuit U to additive error ϵ. U may also be referred to as a unitary operator. A unitary operator may perform a unitary operation (e.g., when executed by a quantum computing system). The number of calls to U depends on the type of amplitude estimation algorithm. In some cases, the number of calls to U scales with O(1/ϵ). Thus, these algorithms may offer a quadratic advantage over classical sampling (which scales as O(1/ϵ²)) and have many applications including speedups for Monte Carlo methods and approximate counting.

Generally, amplitude estimation includes a quantum circuit U applied to a register including t qubits, such that U|0^(t)

=cos(θ)|x, 0

+sin(θ)|x′, 1

, where |x

and |x′

are arbitrary states on (t−1) qubits and |0

and |1

are the two possible states of the remaining qubit (this may be referred to as the “last” qubit). More generally, U may affect more than one qubit and the resulting state of interest may be referred to as the marked state. The goal is to estimate the amplitude θ within an additive error ϵ. The related approximate counting problem corresponds to the special case where

${U\left. 0^{t} \right\rangle} = {\frac{1}{2^{{({t - 1})}/2}}\left( {{\sum_{i \in S}{\left. i \right\rangle\left. 0 \right\rangle}} + {\sum_{i \notin S}{\left. i \right\rangle\left. 1 \right\rangle}}} \right)}$

is a uniform superposition (where all amplitudes are the same) over bit strings of length (t−1) and the binary label on the last qubit is the indicator function for S⊆{0,1}^(t-1). Amplitude estimation in this setting provides an estimate for |S|. Approximate counting in turn generalizes Grover's search and the problem of finding the number of marked elements in a list.

This disclosure discusses two applications of amplitude estimation: (1) to quantum Monte Carlo methods and (2) inner product estimation. These applications may be relevant for finance calculations, optimization, machine learning, linear algebra computations.

Quantum Monte Carlo methods are an application of amplitude estimation to the problem of estimating the mean of a real valued random variable by sampling. Let ƒ(x, S) be a real valued function where x is the input and S is the random seed and let a be the variance of ƒ(x, S). Classically estimating the mean E_(s)ƒ(x, S) to additive error E is known to require N=O(σ²/ϵ²) samples. In contrast, quantum amplitude estimation can be used to estimate the mean using

$N = {\overset{\sim}{O}\left( \frac{\sigma}{\epsilon} \right)}$

samples, thus quadratically improving the dependence on σ and 1/ϵ. For example, in a finance calculation, one often needs accuracy of ϵ=0.001 which implies that the number of samples can be reduced by a factor 1000× when using a quantum algorithm.

Amplitude estimation also has applications which are not reducible to approximate counting, for example, where the goal is to estimate

0|U|0

for a general unitary operator U. Some examples of this kind include applications of amplitude estimation to quantum linear algebra and machine learning. For example, quantum procedures for estimating the inner product

x|y

between vectors x, y∈

^(n) may be useful for quantum classification and clustering algorithms. Amplitude estimation may be used to achieve quantum speedup for this inner product estimation. For example, amplitude estimation may use O(1/ϵ) samples as opposed to the O(1/ϵ²) samples, which is the number of samples used classically for estimating the inner product to error ϵ. Amplitude estimation variants may also be beneficial for reducing the condition number dependence K in the linear system solvers from O(κ²) to O(κ). This reduces the running time and also the depth of the linear system solver quantum circuit by a factor of κ, which in many cases is more than a 100×.

Some amplitude estimation algorithms (e.g., referred to as standard amplitude estimation algorithms) invoke a controlled version of the quantum circuit U at least O(1/ϵ) times in series. Afterwards a quantum Fourier transform is performed to estimate amplitudes to error ϵ (e.g., using a phase estimation algorithm). This makes applications of amplitude estimation, like Monte Carlo methods or approximate counting, prohibitive for near term hardware because, even in cases where the oracle itself does not have significant depth, the high number of oracle calls in series makes the total depth of the circuits prohibitive for the high noise rates of current noisy intermediate scale quantum (NISQ) devices. In summary, depending on the depth of the circuit U and the number of sequential executions, NISQ devices may be incapable of running amplitude estimation algorithms. Thus, it may be advantageous to reduce the resources used for amplitude estimation algorithms so that they can be executed by NISQ devices.

Table 1 lists several amplitude estimation algorithms along with the resources used. Specifically, Table 1 illustrates asymptotic tradeoffs for various amplitude estimation algorithms. In the table, n is the number of qubits, d is the circuit depth for a single application of U, ϵ is the additive error, β∈(0,1], k≥2, and q∈[k−1]. The approximate counting algorithm is included for reference, but it requires extra structure and is not strictly an amplitude estimation algorithm.

Algorithm Qubits Depth Number of Calls Standard Amplitude Estimation n + log(1/ϵ) ${d \cdot \frac{1}{\epsilon}} + {\log\;{\log\left( {1/\epsilon} \right)}}$ $\frac{1}{\epsilon}$ QFT Free Amplitude Estimation n $d \cdot \frac{1}{\epsilon}$ $\frac{1}{\epsilon}$ Approximate Counting n ${d \cdot \left( \frac{1}{\epsilon} \right)^{1 - \beta}}{\log\left( {1/\epsilon} \right)}$ $\left( \frac{1}{\epsilon} \right)^{1 + \beta}{\log\left( {1/\epsilon} \right)}$ IQAE n $d \cdot \frac{1}{\epsilon}$ $\frac{1}{\epsilon}$ Power-Law Amplitude Estimation n $d \cdot \left( \frac{1}{\epsilon} \right)^{1 - \beta}$ $\left( \frac{1}{\epsilon} \right)^{1 + \beta}$ QoPrime Amplitude Estimation n $d \cdot \left( \frac{1}{\epsilon} \right)^{1 - {q/k}}$ $\left( \frac{1}{\epsilon} \right)^{1 + {q/k}}$

The Power-Law and QoPrime algorithms are described in this disclosure. In contrast to the other algorithms in Table 1, these algorithms may be executed by NISQ devices. These algorithms were motivated by the following question: can quantum amplitude estimation algorithms that use quantum circuits with depth (e.g., asymptotically) less than O(1/ϵ) provide any speed up compared to classical algorithms (which scale as O(1/ϵ²))? Thus, this disclosure focuses on algorithms that interpolate between classical and quantum amplitude estimation. At a high level, amplitude estimation with a classical computer uses O(1/ϵ²) applications of the oracle, but these applications may be performed in parallel. The standard quantum amplitude estimation algorithm on the other hand uses O(1/ϵ) serial applications of the oracle, but this computation may be performed once. This disclosure presents two new and different algorithms that interpolate between the classical and quantum settings with a tradeoff of ND=O(1/ϵ²), where N is the total number of oracle calls and D is the maximum number of sequential oracle calls that occurs during any single run of the quantum computing system.

As used herein, sequential or serial oracle calls refers to executing U “back-to-back” a number of times to generate a quantum state. This may be performed by generating a quantum circuit that includes U a number of times in a sequence. The quantum computing system may then be instructed to execute the circuit. The quantum state may then be measured afterwards.

The Power law Amplitude Estimation Algorithm

Our first algorithm utilizes the framework proposed by Suzuki et al. for quantum Fourier transform (QFT) free amplitude estimation (Y. Suzuki, S. Uno, R. Raymond, T. Tanaka, T. Onodera, and N. Yamamoto, “Amplitude estimation without phase estimation,” Quantum Information Processing, vol. 19, no. 2, p. 75, 2020). In the Suzuki et al. framework, the oracle is invoked with varying depths, and then measurements in the standard basis are performed, followed by classical maximum likelihood post-processing to estimate the amplitude. Algorithms in this framework are specified as schedules (m_(k), N_(k)) where the oracle is applied m_(k) times in series for N_(k) iterations, and, at the end, the results are post-processed classically using maximum likelihood estimation.

Specifically, Suzuki describes a QFT free amplitude estimation algorithm that uses an exponential schedule with depths m_(k)=2^(k) all the way up to the maximum depth of O(1/ϵ) and chooses N_(k)=N_(shot) to be a constant. The quantum Fourier transform step at the end of the algorithm is eliminated. However, the asymptotic efficiency of max-likelihood post-processing is not established rigorously.

The power law amplitude estimation algorithm (Power law AE) of this disclosure uses power law schedules with constant N_(k)=N_(shot) and

${m_{k} = \left\lfloor k^{\frac{1 - \beta}{2\beta}} \right\rfloor},$

where k starts from 1 and increases until the maximum depth for the quantum circuit is O(1/ϵ^(1-β)) for β∈(0,1] (“(“is exclusive and “]” is inclusive), at a cost of more parallel runs with the total number of oracle calls scaling as O(1/ϵ^(1+β)). When β tends to 0, the schedule approaches the exponential schedule, while when β is equal to 1, the algorithm is the one where we only sample directly from the oracle as in the case of sampling with a classical computer. The analysis of the power law amplitude estimation is based on the observation that maximum likelihood estimation in this setting is equivalent to sub-dividing the domain for the amplitude θ into O(1/ϵ) equal parts and performing Bayesian updates starting from a uniform prior probability distribution, where every value has equal probability of being the correct one. In some embodiments, if the prior and the log-likelihood function (described later) are sufficiently regular, the Bernstein Von-Mises theorem can be used (also described later), which can be viewed as a Bayesian central limit theorem that quantifies the rate of convergence to the Bayesian estimator to the normal distribution centered at the true value of θ with variance 1/(N_(shot)I_(ƒ)(α)), where α=cos²(θ). The variant of the Bernstein Von-Mises theorem may be helpful for the analysis as it bounds the rate of convergence of the posterior distribution to the normal distribution in the

₁ norm. The tradeoff ND=O(1/ϵ²) follows from Fisher information calculations for the power law schedules.

This disclosure also studies the behavior of the power law algorithm in the presence of (e.g., depolarizing) noise and describes a method to make the power law algorithm robust to noise. In fact, this disclosure provides a method for choosing the (e.g., optimal) parameter β given a desired accuracy and noise level. One can view the power law algorithm as a (e.g., optimal or improved) method of utilizing the power of the available quantum circuit in terms of depth. Said differently, instead of waiting for quantum circuits to have good enough fidelity to apply sequentially a number of oracle calls of the order of 1/ϵ, (which, for Monte Carlo applications, can grow between 10³ to 10⁶) the power-law amplitude estimation algorithm may use quantum circuits of less depth (e.g., any depth provided by a NISQ device) and provide a corresponding speedup.

Aspects of the analysis described above assumes regularity conditions on the log-likelihood and the prior for the Bernstein Von-Mises theorem. These regularity conditions may be difficult to verify, even though they seem to hold empirically for the log-likelihood function for amplitude estimation. It therefore may be desirable to have an amplitude estimation algorithm achieving the same ND=O(1/ϵ²) tradeoffs that does not rely on these conditions. Thus, we determined a schedule that increases (e.g., maximizes) the Fisher information for a given number of oracle calls. The (e.g., optimal) determined solution may be a schedule that makes oracle calls at the maximal possible depth. However, making oracle calls only at a particular depth may not be sufficient for amplitude estimation due to the periodicity of the function cos²((2k+1)θ), meaning it may be difficult (or impossible) to differentiate between different values of 0 that result to the same value of cos²((2k+1)θ). This led us to consider amplitude estimation algorithms that execute circuits at two (or more) different depths and combine the results, leading to the QoPrime amplitude estimation algorithm.

The QoPrime Amplitude Estimation algorithm

Our second amplitude estimation algorithm (QoPrime AE) uses a number theoretic approach to amplitude estimation that may enable a fully rigorous correctness proof and may have a similar or same depth vs. number of oracle calls tradeoff (ND=O(1/ϵ²)) as the power-law amplitude estimation for a set of exponents that take discrete values between 0 and 1.

The idea for the QoPrime amplitude estimation algorithm is to choose k different co-prime moduli, each close to O(1/ϵ^(1/k)), so their product is N=O(1/ϵ), where N is the total number of oracle calls. The true value of the amplitude θ may be defined to be πM/2N where M∈[0, N]. The algorithm estimates └[M┘ mod N_(i), where N_(i) is the product of q out of the k moduli using N/N_(i) sequential calls to the oracle followed by measurements (e.g., in the standard basis). These low accuracy estimates of the amplitude are then combined using the Chinese remainder theorem to obtain └M┘ mod N while the fractional (non-integer) part of M is estimated separately. As an example, the QoPrime algorithm is explained for the simplest case of k=2 moduli. this case corresponds to an algorithm with D=O(1/ϵ^(1/2)) and N=O(1/ϵ^(3/2)), where N is the total number of oracle calls and D is the maximum number of sequential oracle calls. Let a and b be the two largest co-prime numbers less than the depth D=O(1/√{square root over (ϵ)}). For simplicity, assume that the true value for θ is of the form

$\frac{\pi\; M}{2\left( {{2a} + 1} \right)\left( {{2b} + 1} \right)}$

for some integer M∈[0,(2a+1)(2b+1)]. The QoPrime algorithm thus determines M mod(2a+1) and M mod(2b+1) and then determines M using the Chinese remainder theorem.

Invoking the oracle a times in series followed by a measurement (e.g., in the standard basis) is equivalent or similar to sampling from a Bernoulli random variable with a success probability of cos²((2a+1)θ). As a function of θ, this probability is periodic with period

$\frac{\pi}{{2a} + 1}$

and is therefore a function of ±M mod(2b+1). Subdividing the interval [0, π/2] into (2b+1) equal parts, it follows from the additive form of the Chernoff bounds that ±M mod(2b+1) can be recovered with O((2b+1)²) samples. The algorithm estimates M mod(2b+1) with O(2b+1)² repetitions of a quantum circuit of depth (2a+1) and, similarly, M mod(2a+1) with O(2a+1)² repetitions of a quantum circuit of depth (2b+1). The Chinese remainder theorem is then used to combine these low precision estimates to obtain the integer M∈[0, (2a+1)(2b+1)], thus boosting the precision for the estimation procedure. The total number of oracle calls made was O(ab²+ba²)=O(1/ϵ^(1.5)). The maximum depth for the oracle call was O(1/ϵ^(1/2)). The procedure can be extended to the more general case where the true value for θ is

$\frac{\pi\; M}{2N}$

where N is the product of q≤K coprime moduli and M∈[0, N]. In this case, this disclosure also shows how to pick these values q, k. Here, the maximum depth of the quantum circuit is D=O(1/ϵ^(1−q/k)), and the total number of oracle calls are N=O(1/ϵ^(1+q/k)).

Additionally, the accuracy of the algorithm in the presence of depolarizing noise is considered. Specifically, this disclosure provides a number of graphs to show the behavior under noise. The analysis of the algorithm in a noisy setting shows that noise may limit the depth of the oracle calls that can be made, but it also allows us to choose (e.g., optimally) the algorithm parameters to reduce (e.g., minimize) the number of oracle calls for a given target approximation error and noise rate. The experiments show that the constant overhead for the QoPrime algorithm is reasonable (C<10) for most settings of interest (explained further below).

This disclosure also benchmarks the two new low depth amplitude estimation algorithms with a IQAE algorithm in noisy settings (Section 4.3). For reference, IQAE is described in D. Grinko, J. Gacon, C. Zoufal, and S. Woerner, “Iterative quantum amplitude estimation,” arXiv preprint arXiv:1912.05559, 2019. Algorithms such as IQAE require access to a full circuit depth of O(1/ϵ), and this large depth is exponentially penalized by the depolarizing noise by requiring an exponentially large number of runs of the quantum circuit to achieve a precision below the noise level. In comparison, the power law and the QoPrime amplitude estimation algorithms transition smoothly to a classical estimation scaling and do not suffer from an exponential growth in oracle calls. The Power law amplitude estimation algorithm shows the best practical performance according to the simulations for different error rates and noise levels.

Overall, this disclosure presents two low depth algorithms for quantum amplitude estimation, thus potentially bringing a number of applications closer to the NISQ era. Specifically, the tradeoff between the total number of oracle calls and the depth of the quantum circuit that is offered by these algorithms may be a powerful tool towards finding quantum applications for near and intermediate term devices.

The disclosure is organized as follows: In Section 2, we describe and analyze the power law amplitude estimation algorithm, while in Section 3, we describe and analyze the QoPrime amplitude estimation algorithm. In Section 4, we present empirical evidence of the performance of our algorithm and benchmarks with other state-of-the-art algorithms for amplitude estimation.

2. Amplitude Estimation with Power Law Schedules

2.1 Preliminaries

In this section, we introduce some preliminaries for the analysis of amplitude estimation with power law schedules. Let X be a random variable with density function ƒ(X, α) that is determined by the single unknown parameter α. Let l(X, α)=log ƒ(X, α) be the log-density function and let

${l^{\prime}\left( {x,\alpha} \right)} = {\frac{\partial{l\left( {x,\alpha} \right)}}{\partial\alpha}.}$

In this section, all expectations are with respect to the density function ƒ(X, α) and ′ denotes the partial derivative with respect to α.

The Fisher information I_(ƒ)(α) is defined as the variance of the log-likelihood function, that is I_(ƒ)(α)=Var[l′(X, α)]. It can also be equivalently defined as I_(ƒ) (α)=−E_(ƒ)[l″(X, α)]. More generally, for parameters α∈

^(n), the Fisher information is defined as the covariance matrix of the second partial derivatives of l(X, α) with respect to α.

Let α* be the true value for α and consider a Bayesian setting where a prior distribution on α is updated given i.i.d. (Independent and Identically Distributed Random Variables) samples X_(i), i∈[n] from a distribution ƒ(X, α*). The Bernstein-Von Mises theorem (stated below) quantifies the rate of convergence of the Bayesian estimator to the normal distribution with mean and variance

$\left( {\alpha^{*},\frac{1}{{nI}_{f}\left( \alpha^{*} \right)}} \right)$

in the

₁ norm for cases where the log-likelihood and the prior distribution are sufficiently regular. The complete list of regularity conditions for the theorem is given in Section 5.

Theorem 2.1 (Bernstein Von-Mises Theorem): Let X_(i), i∈[n] be independent samples from a distribution ƒ(X, α*) and let R₀ be the prior distribution on α. Let R_(n) be the posterior distribution after n samples and let Q_(n,α*) be the Gaussian with mean and variance

$\left( {\alpha^{*},\frac{1}{{nI}_{f}\left( \alpha^{*} \right)}} \right).$

If ƒ(X, α*) and R₀ satisfy the regularity conditions enumerated in the Section 5, then there exists a constant c>0 such that,

$\begin{matrix} {{\begin{matrix} \Pr \\ {{X_{i} \sim f},{i \in \lbrack n\rbrack}} \end{matrix}\left\lbrack {{{R_{n} - Q_{n,\alpha^{*}}}}_{1} \geq {c\sqrt{\frac{1}{n}}}} \right\rbrack} = {O\left( \frac{1}{\sqrt{n}} \right)}} & (1) \end{matrix}$

As we have defined before, the amplitude estimation algorithm has access to a quantum circuit U such that U|0^(t)

=cos(θ)|x, 0

+sin(θ)|x′, 1

where |x

and |x′

are arbitrary states on (t−1) qubits. If the second register is measured (e.g., in the standard basis), then the distribution of the measurement outcome ƒ(X, α) may be a Bernoulli random variable with success probability α=cos²(θ)∈[0,1]. Thus, if the quantum computing system executes a circuit that includes k sequential calls to the circuit U, then the quantum state cos((2k+1)θ)|x, 0

+sin((2k+1)θ)|x′, 1

may be generated. In light of this, measuring this generated state (e.g., in the standard basis), the distribution of the measurement outcome may be a Bernoulli random variable with success probability cos²((2k+1)θ).

A quantum amplitude estimation algorithm therefore has access to samples from Bernoulli random variables with success probability cos²((2k+1)θ) where k is the number of sequential oracle calls in the quantum circuit, which corresponds to its depth. The higher depth samples (said differently, circuits with more sequential oracle calls) may be more informative for estimating θ. An example may help explain this: when we are sampling from a random variable with success probability cos²(3θ), with O(1/ϵ) samples we may get ϵ close to 3θ, which means we get ϵ/3 close to θ thus getting a better accuracy. The next proposition quantifies the advantage for higher depth samples, showing that the Fisher information may grow quadratically with the number of sequential oracle calls m_(k).

Proposition 2.2: Let ƒ(X, α)=β^(X)(1−β)^(1-X) for parameter α=cos²(θ) and β=cos²((2m_(k)+1)θ) for a positive integer m_(k). The Fisher information is

${I_{f}(\alpha)} = {\frac{\left( {{2m_{k}} + 1} \right)^{2}}{\alpha\left( {1 - \alpha} \right)}.}$

Proof: As α=cos²(θ) we have dα=2 cos(θ)sin(θ)dθ. The log-likelihood function is l(X, α)=X log β+(1−X)log(1−β). Thus,

$\begin{matrix} {{I_{f}(\alpha)} = {- {E_{f}\left\lbrack {\frac{d}{d\;\alpha^{2}}\left( {{2X\;\log\mspace{11mu}{\cos\left( {\left( {{2m_{k}} + 1} \right)\theta} \right)}} + {2\left( {1 - X} \right)\log\mspace{11mu}{\sin\left( {\left( {{2m_{k}} + 1} \right)\theta} \right)}}} \right)} \right\rbrack}}} \\ {= {\frac{- 1}{2{\alpha\left( {1 - \alpha} \right)}}{E_{f}\left\lbrack {\frac{d}{d\;\theta^{2}}\begin{pmatrix} {{X\mspace{11mu}\log\mspace{14mu}{\cos\left( {\left( {{2m_{k}} + 1} \right)\theta} \right)}} +} \\ {\left( {1 - X} \right)\log\mspace{11mu}{\sin\left( {\left( {{2m_{k}} + 1} \right)\theta} \right)}} \end{pmatrix}} \right\rbrack}}} \\ {= {\frac{\left( {{2m_{k}} + 1} \right)^{2}}{2{\alpha\left( {1 - \alpha} \right)}}\left( {\frac{e_{f}\lbrack X\rbrack}{\cos^{2}\left( {\left( {{2m_{k}} + 1} \right)\theta} \right)} + \frac{E_{f}\left\lbrack {1 - X} \right\rbrack}{\sin^{2}\left( {\left( {{2m_{k}} + 1} \right)\theta} \right)}} \right)}} \\ {= {\frac{\left( {{2m_{k}} + 1} \right)^{2}}{2{\alpha\left( {1 - \alpha} \right)}}.}} \end{matrix}$

The Fisher information is defined as a variance and is therefore additive over independent samples that do not need to be identically distributed. The Fisher information of an amplitude estimation schedule (m_(k), N_(k)) is the sum of the Fisher informations for the individual samples. End of proof for Proposition 2.2.

2.2 The Power Law Amplitude Estimation Algorithm

An embodiment of the amplitude estimation algorithm using power law schedules is given as Algorithm 2.1 below. It is then analyzed to establish the tradeoff between the depth and the total number of oracle calls in a setting where the Bernstein Von Mises Theorem is applicable.

Algorithm 2.1 (the Power Law Amplitude Estimation Algorithm):

Goal: For a unitary gate U that performs U|0

=cos(θ)|x, 0

+sin(θ)|x′, 1

and for parameters β∈(0,1] and N_(shot)∈Z, estimate of θ within accuracy ϵ (e.g., with high probability).

Step 1: Initialize the prior to be a uniform distribution on angles

${\theta = \frac{{\pi\; t} \in}{2}}.$

Step 2: For k=1 to

$K = {\max\left( {\frac{1}{\epsilon^{2\beta}},\ {\log\left( {1/\epsilon} \right)}} \right)}$

do:

-   -   a. Step 3: Initialize N_(k) ₀ =0 and N_(k) ₁ =0.     -   b. Step 4: For i=1 to N_(shots) do:         -   i. Step 5: Apply

$m_{k} = \left\lfloor k^{\frac{1 - \beta}{2\beta}} \right\rfloor$

-   -   -   sequential oracle calls and measure the last qubit of the             resulting quantum state (e.g., in the standard basis).         -   ii. Step 6: If the outcome is 0, then N_(k) ₀ =N_(k) ₀ +1,             else N_(k) ₁ =N_(k) ₁ +1.

    -   c. Step 7: end for.

    -   d. Step 8: Perform Bayesian updates         p(θ)→p(θ)cos((2m_(k)+1)θ)^(N) ^(k0) sin((2m_(k)+1)θ)^(N) ^(k1)         for θ=πtϵ/2 for integer valued t∈[0,1/ϵ] and interpolate to         obtain a posterior probability distribution.

Step 9: end for.

Step 10: Output θ with a (e.g., the highest) probability according to the posterior probability distribution.

End of Algorithm 2.1.

The next theorem (below) shows that Algorithm 2.1 achieves approximation error ϵ with parameters

${N = {{{O\left( \frac{1}{\epsilon^{1 + \beta}} \right)}\mspace{14mu}{and}\mspace{20mu} D} = {O\left( \frac{1}{\epsilon^{1 - \beta}} \right)}}},$

where D is the maximum number of sequential oracle calls (also referred to as the maximum circuit depth) and N is the total number of oracle calls made. The choice of

$K = {\max\left( {\frac{1}{\epsilon^{2\beta}},{\log\left( {1/\epsilon} \right)}} \right)}$

results in the power law AE algorithm making at least log(1/ϵ) oracle calls for small and approaches the exponential schedule m_(k)=2^(k) if β→0.

Theorem 2.3: the power law amplitude estimation algorithm 2.1 outputs an ϵ accurate estimate with

$N = {O\left( \frac{1}{\epsilon^{1 + \beta}} \right)}$

oracle calls and maximum circuit depth

$D = {O\left( \frac{1}{\epsilon^{1 - \beta}} \right)}$

with probability at least 0.9, that is the algorithm attains the tradeoff

${ND} = {O\left( \frac{1}{\epsilon^{2}} \right)}$

in settings where the Bernstein-Von Mises theorem is applicable.

Proof: The total number of oracle calls for Algorithm 2.1 is N=Σ_(k∈[K])N_(shot)(2m_(k)+1) while the Fisher information for the power law schedule can be computed as

${I_{f}(\alpha)} = {\frac{N_{shot}}{\alpha\left( {1 - \alpha} \right)}{\Sigma_{k \in {\lbrack K\rbrack}}\left( {{2m_{k}} + 1} \right)}^{2}}$

using proposition 2.2. Approximating the sums in N and I_(ƒ)(α) by the corresponding integrals (where we replace the sum in the definition with an integral) gives I_(ƒ)(α)=O(K^(1/β)), N=O(K^((1+β)/2β)) and maximum depth D=O(K^((1-β)/2β)), so that I_(ƒ)(α)=O(ND). Note that for

${K = {\max\left( {\frac{1}{\epsilon^{2\beta}},{\log\left( {1/\epsilon} \right)}} \right)}},$

we nay

$N = {{{O\left( \frac{1}{\epsilon^{1 + \beta}} \right)}\mspace{14mu}{and}\mspace{20mu} D} = {{{O\left( \frac{1}{\epsilon^{1 - \beta}} \right)}\mspace{20mu}{and}\mspace{20mu}{I_{f}(\alpha)}} = {{O\left( \frac{1}{\epsilon^{2}} \right)}.}}}$

Next, we show that with probability at least 0.9, the estimate output by the algorithm is within additive error E of the true value. Applying the Bernstein Von-Mises Theorem, the

₁ distance between the posterior distribution and the Gaussian with mean and variance (α*, 1/√{square root over (N_(shot)I_(ƒ)(α*))}) is at most c/√{square root over (N_(shot))} with probability at least 1−c′/√{square root over (N_(shot))} for some constants c, c′>0.

Choosing

$N_{shot} > \left( \frac{\max\left( {c,c^{\prime}} \right)}{\delta} \right)^{2}$

for δ=0.05, the estimate is within

$\left( {\alpha^{*} \pm \frac{3\delta}{c\sqrt{I_{f}\left( \alpha^{*} \right)}}} \right)$

with probability at least 1−δ(1+1)−0.0013≥0.89. The success probability can be boosted to

$1 - \frac{1}{pol{y(\zeta)}}$

for ζ>0 by running the power law algorithm O(log(1/ζ) times and outputting the most frequent estimate.

The above proof analyzes Algorithm 2.1 in settings where the Bernstein Von Mises theorem is applicable. The complete list of regularity conditions for the theorem are enumerated in Section 5. In high level, for some neighborhood around the real value of θ the regularity conditions impose: the smoothness of the prior distribution; the smoothness of the density function ƒ(X, θ), which will be satisfied if the norm of log-likelihood is bounded around θ; and the smoothness and differentiability of the Fisher information around θ, which will be true if the log-likelihood function has derivatives of order at least 3.

FIG. 1 plots the log-likelihood function for the power law AE algorithm for a fixed exponent β and varying depths (k) and a true value for θ chosen uniformly at random from [0, π/2]. The figure illustrates that the log-likelihood function is smooth in a neighborhood around the true value of θ indicating that the regularity conditions for the Bernstein-Von Mises theorem may be plausible in this setting. Adding noise may further regularize the log-likelihood functions and enforce the regularity conditions for the Bernstein-Von Mises theorem.

The constants c, c′ in the proof of Theorem 2.3 may be determined explicitly if the regularity conditions are verified giving an explicit value for N_(shot). In the absence of an explicit value, if experimental results demonstrate convergence for a small value of N_(shot), may be taken as evidence that N_(shot) is a moderately small constant.

2.3 Power Law Amplitude Estimation with Noise

Noise provides a natural constraint on accessible circuit depths and noise models exponentially penalize larger depths, leading to an exponential decoherence of quantum states. Therefore, exponentially more classical samples may be needed to battle the noisy information (depending on the algorithm). In this section, we explain this effect on the power law algorithm in the case of a depolarizing noise model.

Proposition 2.4: Assuming a depolarizing noise channel with a per-oracle-call rate of γ≥0, if measurements (e.g., in the standard basis) are performed on a quantum circuit of k sequential calls to the oracle U, the distribution of measurement outcomes is a Bernoulli random variable with success probability

$\begin{matrix} {{p = {\frac{1}{2} - {\frac{1}{2}e^{{- \gamma}\; k}{\cos\left( {2\left( {{2k} + 1} \right)\theta} \right)}}}}.} & (2) \end{matrix}$

Let {m_(k)}_(k=0, . . . , K) be a schedule. The Fisher information in the presence of depolarizing noise with parameter γ is given by

$\begin{matrix} {{I_{f}(\alpha)} = {\frac{N_{shot}}{\sin^{2}\left( {2\theta} \right)}{\sum\limits_{k = 0}^{K}\;{\left( {{2m_{k}} + 1} \right)^{2}{\frac{4e^{{- 2}\gamma m_{k}}{\sin^{2}\left( {2\left( {{2m_{k}} + 1} \right)\theta} \right)}}{1 - {e^{{- 2}\gamma m_{k}}{\cos^{2}\left( {2\left( {{2m_{k}} + 1} \right)\theta} \right)}}}.}}}}} & (3) \end{matrix}$

For more information, see T. Tanaka, Y. Suzuki, S. Uno, R. Raymond, T. Onodera, and N. Yamamoto, “Amplitude estimation via maximum likelihood on noisy quantum computer,” arXiv preprint arXiv:2006.16223, 2020. Recall that α=cos²(θ) is the probability of success without depolarizing noise. For k such that γm_(k) becomes big enough so the term e^(−2γm) ^(k) becomes very close to zero, the second term in the sum will be exponentially suppressed. Thus, the Fisher information will not increase significantly even if the circuit depth increases (said differently, if the number of sequential oracle calls increases).

Consider the power law schedule given by m_(k)=└k┘ for k=0, 1, . . . , K for

$\eta = \frac{1 - \beta}{2\beta}$

and β∈(0,1], as described in Algorithm 2.1. We see that

${I_{f}(\alpha)} \leq {{\sum\limits_{{\gamma{\lfloor{k^{⩓}\eta}\rfloor}} \leq 1}{\left( {{2\left\lfloor {k\bigwedge\eta} \right\rfloor} + 1} \right)^{2}C}} + {{exponentially}\mspace{14mu}{suppressed}}}$ ${{terms} \leq {10C{\underset{k = 0}{\sum\limits^{{({1/\gamma})}^{1/\eta}}}\left( {{2k^{\eta}} + 1} \right)^{2}}} \leq {C^{\prime}\text{/}\gamma^{2 + {1/\eta}}}} = {C^{\prime}\text{/}{\gamma^{2/{({1 - \beta})}}.}}$

Here C is a constant that can be taken to be

$C:=\frac{4N_{shot}}{\sin^{4}\left( {2\theta} \right)}$

(if this diverges, add small random noise to θ). So, by the Cramer-Rao bound,

$\begin{matrix} {\left. {{I_{f}(\alpha)} \leq \frac{c^{\prime}}{\gamma^{2/{({1 - \beta})}}}}\Rightarrow{\epsilon \geq {c\;\gamma^{1/{({1 - \beta})}}}} \right.,} & (4) \end{matrix}$

where c:=C′^(−1/2). This implies that given a noise level γ we cannot get an error rate ϵ smaller than cγ^(1/(1-β)) by increasing the depth for a power law schedule with a fixed parameter β. Said differently, given a desired error rate ϵ and a noise level γ, we can determine the parameter β to match the lower bound in equation 4. In this case, it is assumed that the regularity conditions of Bernstein-von Mises also hold in the noisy setting so that the lower bound can be matched on ϵ.

If the noise level is smaller than the desired error rate, then the exponential schedule (for β equal to 0) may be used since circuits of depth up to O(1/ϵ) may be applied. For ϵ<γ, we assume that we can match the lower bound in equation 4 so that β can be determined as follows.

Proposition 2.5: Assume a target error E and noise level γ with 0<ϵ<γ<1 and that the regularity conditions of the Bernstein-von Mises hold in the noisy setting. The parameter β of the power law algorithm may be determined according to

$\beta \geq {1 - \frac{\log\;\gamma}{\log\;\epsilon\text{/}c}}$

to achieve ϵ≤cγ^(1/(1-β)) for the power law schedule m_(k)=└k^(η)┘ for k=0, 1, . . . , K with parameter

$\eta = \frac{1 - \beta}{2\beta}$

(with high probability over the randomness of the algorithm as previously discussed).

3. QoPrime: A Number Theoretic Amplitude Estimation Algorithm

3.1 Preliminaries

This section introduces technical tools used for the QoPrime AE algorithm. In particular, this section describes the Chinese remainder theorem and the additive form of the Chernoff bounds.

Theorem 3.1 (The Chinese Remainder Theorem): Let a_(i)∈

, i∈[0, k] be pairwise coprime numbers such that for all i≠j, gcd(a_(i),a_(j))=1 and let N=Π_(i∈[k])a_(i). Then for all b_(i)∈[a_(i)],i∈[k] there is a polynomial time algorithm to find M∈[N] such that M mod a_(i)=b_(i).

Proof: First describe the proof for k=2. By applying the extended Euclidean algorithm, we can find u₁, u₂∈

such that,

1=gcd(a ₁ ,a ₂)=u ₁ a ₁ +u ₂ a ₂,  (5)

Then M=(b₂u₁a₁+b₁u₂a₂)mod a₁a₂ satisfies M=b_(i) mod a_(i) for i=1,2.

For the proof of the general case, note that the k−1 numbers a₁, a₂, a₃, a₄, . . . , a_(k) are coprime and, by the above argument, the constraints M=b_(i) mod a_(i) for i=1,2 is equivalent to M=(b₂u₁a₁+b₁u₂a₂) mod a₁ a₂. The procedure may therefore be repeated iteratively to find the desired M. End of proof for Theorem 3.1.

The QoPrime algorithms in this disclosure use relatively coprime moduli a_(i), however the results may be adapted to a setting where the a_(i) are not pairwise coprime. For example, this may be done by replacing N=Π_(i∈[k])a₁ with the least common multiple of the a_(i).

Another tool used for the QoPrime algorithm is the additive form of the Chernoff bound and some supplementary calculations on the entropy of the binomial distribution.

Theorem 3.2 (Chernoff-Hoeffding Bound): Let X_(i) for i∈[m] be i.i.d. random variables such that X_(i)∈{0,1} with expectation

[X_(i)]=p and let ϵ>0 and

$X = {\frac{1}{m}{\sum_{i \in {\lbrack m\rbrack}}{X_{i}.}}}$

Then,

1. Pr[X>p+ϵ]≤e ^(−D(p+ϵ∥p)m) and

2. Pr[X<p−ϵ]≤e ^(−D(p-ϵ∥P)m),

where the relative entropy D(x∥y)=x ln

$\frac{x}{y} + \left( {1 - x} \right)$

ln

$\frac{\left( {1 - x} \right)}{\left( {1 - y} \right)}$

and the relative entropy can be lower bounded using the inequality

${D\left( {x{}y} \right)} \geq \frac{\left( {x - y} \right)^{2}}{2{\max\left( {x,y} \right)}}$

for all x, y∈[0,1].

The lower bound on the relative entropy may be used to make the Chernoff bounds effective.

Next, we derive a corollary may be useful for the analysis of the QoPrime algorithm.

Corollary 3.3: The following lower bound holds for the relative entropy for all x, y∈[0,1],

${D\left( {x{}y} \right)} \geq {\frac{\left( {x - y} \right)^{2}}{2{\max\left( {{1 - x},{1 - y}} \right)}}.}$

Proof: the relative entropy is symmetric under the substitution (x, y)→(1−x,1−y), that is D(x∥y)=D((1−x)∥(1−y)). The result follows by applying the inequality

${D\left( {x{}y} \right)} \geq \frac{\left( {x - y} \right)^{2}}{2{\max\left( {x,y} \right)}}$

for (x′, y′)=(1−x,1−y). End of proof.

This disclosure also provides the Multiplicative Chernoff bounds which may be used to compute one or more constants in the QoPrime algorithm.

Theorem 3.4 (Multiplicative Chernoff Bounds): Let X_(i) for i∈[m] be independent random variables such that X_(i)∈[0,1] and let X=Σ_(i∈[m])X_(i). Then, Pr[|X−

[X]|≥β

[X]]≤

for 0<β<1.

3.2 The QoPrime AE Algorithm

An embodiment of the QoPrime AE algorithm is presented as Algorithm 3.1. The implementation of the steps of the algorithm is described in greater detail below and the algorithm is analyzed to establish correctness and bound the running time.

Recall that an amplitude estimation algorithm has access to a quantum circuit U such that U|0^(t))=cos(θ)|x, 0

+sin(θ)|x′, 1

where |x

and |x′

are arbitrary states on (t−1) qubits. Let R₀ be the reflection in |0^(t)

that is R₀|0^(t)

=|0^(t)

and R₀|0^(⊥)

=−|0^(⊥)

(where |0^(⊥)) is any state orthogonal to the |0^(t)

state), and let S₀ be the reflection on |0

in the second register, that is S₀|x, 0

=|x, 0

and S₀|x, 1

=−|x, 1

for all |x

. The QoPrime algorithm uses (2k+1) sequential applications of the circuit for U to create the states,

|ϕ_(k)

:=(UR ₀ U ⁻¹ S ₀)^(k) U|0

=cos((2k+1)θ)|x,0

+sin((2k+1)θ)|x′,1

.  (6)

As previously described, an oracle call is a single application (e.g., execution) of the circuit U on a quantum computing system, the total number of oracle calls (N) is a measure of the running time on the quantum computing system of the amplitude estimation algorithm. Also, the maximum circuit depth (D) is the largest number of sequential calls to U in a single run of the quantum computing system. As stated above, (2k+1) sequential oracle calls creates a copy of the quantum state |ϕ_(k)

.

Some embodiments of the QoPrime algorithm are parametrized by integers (k, q) where k≥2 and 1≤q≤(k−1). The parameter k is the number of moduli used for the reconstruction procedure and q determines the number of moduli that are grouped together. The algorithm includes determining coprime moduli (n₁, n₂, . . . , n_(k)) where each modulus is an integer close to └1/ϵ^(1/k)┘. Let

$\theta = \frac{\pi M}{2N}$

be the true value of θ for M=└M┘+{M} where M∈[0, N] and N=Π_(i∈[k])n_(i). The algorithm includes partitioning the set of the k moduli into groups, π_(i), of size at most q. Let N_(i)=Π_(j∈π) _(i) n_(j) for i∈┌k/q┐ be the product of the moduli in each group.

Some embodiments of the QoPrime algorithm reconstruct estimates M_(i) that are within a (e.g., 0.5) confidence interval around M mod N_(i) with high probability. These estimates may be constructed using N/N_(i) sequential calls to U to prepare the quantum states described by equation 6. Afterwards, the quantum state may be measured (e.g., in the standard basis). These low-precision estimates M_(i) may be combined using the Chinese remainder theorem to determine an ϵ accurate estimation for

$\theta = {\frac{\pi M}{2N}.}$

Some embodiments of the QoPrime algorithm use O(1/ϵ^(1−q/k)) sequential calls to U and a total of O(1/ϵ^(1+q/k)) oracle calls for estimating θ within accuracy ϵ. Similar to the power law algorithm, it trades off the maximum circuit depth against the total number of oracle calls.

Algorithm 3.1 (QoPrime Algorithm for Amplitude Estimation):

Goal: for a unitary gate U that performs U|0^(t)

=cos(θ)|x, 0

+sin(θ)|x′, 1

and for parameters (k, q), where k≥2 is the number of moduli and 1≤q≤(k−1), estimate of θ within accuracy ϵ with success probability at least p, where 1−2ke^(−2c)>p for constant c.

Step 1: Select coprime odd moduli (n₁, n₂, . . . , n_(k)) close to └1/ϵ^(1/k)┘, let N=Π_(i∈[k])n_(i), and ensure that N≥π/ϵ.

Step 2: Partition [k] into ┌k/q┐ groups π_(i) of size at most q and let N_(i)=Π_(j∈π) _(i) n_(j).

Step 3: If q/k≥⅓ then:

-   -   a. Step 4: Use at most

$\frac{24c}{\epsilon^{1 + {q/k}}}$

depth 0 samples (a depth 0 sample is sampling directly from the oracle) to obtain an additive error

$\frac{1}{2\epsilon^{({1 - {q/k}})}}$

estimate for θ with probability at least 1−e^(−2c).

Step 5: Else:

-   -   a. Step 6: Invoke the QoPrime algorithm recursively with         accuracy

$\epsilon^{\prime} = {{{{O\left( {1/\epsilon^{1 - {q/k}}} \right)}\mspace{14mu}{and}\mspace{14mu}{q^{\prime}/k^{\prime}}\mspace{14mu}{such}\mspace{14mu}{that}\mspace{14mu} 1} + {q/k}} < {1 + {q^{\prime}/k^{\prime}}} < {\frac{\left( {1 + {q/k}} \right)}{\left( {1 - {q/k}} \right)}\mspace{14mu}{to}}}$

-   -   -   obtain an additive error estimate for θ.

Step 7: End if.

Step 8: For i=1 to ┌k/q┐ do:

-   -   a. Step 9: Prepare 100 cN_(i) ² copies of the quantum state         |ϕ_((N-N) _(i) _()/2N) _(i)         (see equation 6) and measure (e.g., in the standard basis) for         each quantum state.     -   b. Step 10: Compute

$\hat{l} = \frac{2N_{i}}{\pi}$

-   -   arccos(√{square root over ({circumflex over (p)})}) where         {circumflex over (p)} is the observed probability of outcome 0         of a predetermined qubit.     -   c. Step 11: Define

$\theta = \frac{\pi M}{2N}$

for M=tN_(i)+l. Then determine t from the additive error N/2N_(i) estimate computed in steps 4-8 of the algorithm.

-   -   d. Step 12: Compute M_(i) =(−1)^(t){circumflex over (l)} mod         N_(i), where M_(i) is an estimate of M.

Step 13: end for.

Step 14: Define α to be the number in the interval I=∩_(i)([M_(i) −0.25,M_(i) +0.25] mod 1).

Step 15: Compute M_(i)=└M_(i) +β_(i)┘ where β_(i)∈[−0.25,0.25] is such that {M_(i) +β_(i)}=α.

Step 16: Reconstruct M mod N applying the Chinese Remainder Theorem to the computed M_(i) values.

Step 17: Determine

$\frac{\pi\left( {\overset{\_}{M} + \alpha} \right)}{2N}$

as estimate for θ.

End of algorithm 3.1.

The procedures used in steps 4-12 of the QoPrime Algorithm 3.1 are described next and correctness proofs are provided for each step. Let M=tN_(i)+l for some t∈

and 0≤l≤N_(i). Note that l=M mod N_(i) is the value being estimated in step 5 of the QoPrime algorithm.

Step 9 of the QoPrime algorithm 3.1 samples from a Bernoulli random variable with success probability

$p = {{\cos^{2}\left( \frac{\left( {{tN}_{i} + l} \right)\pi}{2N_{i}} \right)}.}$

Step 10 computes an estimate

$\hat{l} = \frac{2N_{i}}{\pi}$

arccos(√{square root over ({circumflex over (p)})}) where {circumflex over (p)} is the observed probability of outcome 0. The analysis below shows that |{circumflex over (l)}−(−1)^(t)l mod N_(i)|≤0.25 with high probability.

We begin with the observation that if p={circumflex over (p)} then {circumflex over (l)}=(−1)^(t)l mod N_(i) (p represents the real value and ={circumflex over (p)} represents the observed value).

$\begin{matrix} {{\frac{2N_{i}}{\pi}{\arccos\left( \sqrt{p} \right)}} = \left\{ {\begin{matrix} {{\frac{2N_{i}}{\pi}{\arccos\left( {\cos\left( \frac{l\pi}{2N_{i}} \right)} \right)}},} & {{{if}\mspace{14mu} t} = {0\mspace{14mu}{mod}\mspace{14mu} 2}} \\ {{\frac{2N_{i}}{\pi}{\arccos\left( {\sin\left( \frac{l\pi}{2N_{i}} \right)} \right)}},} & {{{if}\mspace{14mu} t} = {1\mspace{14mu}{mod}\mspace{14mu} 2}} \end{matrix} = {\left( {- 1} \right)^{t}l\mspace{14mu}{mod}\mspace{14mu}{N_{i}.}}} \right.} & (7) \end{matrix}$

If

$\theta = \frac{\pi M}{2N}$

for M=tN_(i)+1, then the estimator in step 10 estimates (−1)^(t)l mod N_(i) as shown in Lemma 3.6 (below). The Chinese remainder theorem based reconstruction procedure (step 16) however uses (e.g., requires) l mod N_(i) for all N_(i) making it useful (e.g., necessary) to determine the value of t mod 2. In steps 3-7 of the algorithm, an additive error N_(i)/2 for estimate M=tN_(i)+1 is determined by recursively using the QoPrime algorithm. The value of t is then determined in step 11 by rounding the estimate to the closest integer of the form tN_(i)+N_(i)/2 for 0≤t<N/N_(i). The next lemma establishes the correctness of this recursive procedure.

Lemma 3.5: The recursive procedure in steps 3-7 of the QoPrime algorithm terminates in at most O(log k) iterations and outputs an additive error N_(i)/2N estimate for θ with probability at least 1−9e^(−2c).

Proof: First we establish the correctness of the stopping condition in step 4. If q/k≥⅓, then

$\frac{\left( {1 + {q\text{/}k}} \right)}{2} \geq {\left( {1 - {q\text{/}k}} \right).}$

Thus, by the multiplicative Chernoff bounds, an additive error

$\frac{1}{2\epsilon^{({1 - {q/k}})}}$

estimate for θ is obtained with

$\frac{24c}{\epsilon^{({1 + {q/k}})}}$

depth 0 samples with probability at least 1−e^(2c) for a constant c>0.

Next, we show that the recursion terminates in at most O(log k) steps and that the total number of oracle calls used in the recursive step is upper bounded by O(1/ϵ^(1+q/k)).

The recursive call to the QoPrime algorithm in step 6 uses O(1/ϵ′^((1+q′/k′)))=O(1/ϵ^((1−q/k)(1+q′/k′))) total oracle calls. As (1−q/k)(1+q′/k′)<(1+q/k) the total number of oracle calls used by steps 3-7 may be at most O(1/ϵ^(1+q/k)). The extra oracle calls used for these steps may not change the asymptotic number of oracle calls used by the QoPrime algorithm. Further, because 1+q/k<1+2q/k<(1+q/k)/(1−q/k) it is possible to choose q′=2q, k′=k and with this choice the stopping condition q/k≥⅓ in step 4 will hold after a number of iterations (e.g., at most O(log k) iterations).

The success probability for these steps is similar to (e.g., the same as) the success probability for the final recursive call to the Qo-prime algorithm in step 6, which uses at most ┌k/q┐≤3 moduli by the stopping condition. For this final recursive call, the signs are determined correctly with probability at least 1−3e^(−2c) for the union bound. Further, the argument in Theorem 3.7 shows that the QoPrime algorithm succeeds with probability at least 1−6e^(−2c) in a setting where the t have been determined correctly and there is at most 3 moduli, implying that the steps 3-7 succeed with probability at least 1−9e^(−2c). End of proof for Lemma 3.5.

The following Lemma quantifies the error possibly made in approximating (−1)^(t)l mod N_(i) using the Chernoff bounds to bound the difference between {circumflex over (p)} and p.

Lemma 3.6: Given m=100 cN² samples for modulus N, steps 9-10 of the QoPrime algorithm finds an estimate such that |{circumflex over (l)}−(−1)^(t)l mod N_(i)|≤0.25 with probability at least 1−2e^(−2c).

Proof: Define F: [0,1]→[0, N] as

${F(p)} = \frac{2N}{\pi}$

arccos(√{square root over (p)}) so that F(p)=(−1)^(t)l mod N by equation 7 and F({circumflex over (p)})={circumflex over (l)}. It is sufficient to show that Pr[|F(p)−F({circumflex over (p)})|≥0.25]≤e^(−2c) for all p∈[0,1].

The inverse function G: [0, N]→[0,1] such that

${G(y)} = {\cos^{2}\left( \frac{\pi y}{2N} \right)}$

is monotonically decreasing. Let F(p)=y, then F(p)−F({circumflex over (p)})≥0.25 is equivalent to (y−0.25)≥F({circumflex over (p)}), that is G(y−0.25)<{circumflex over (p)}. Applying the additive Chernoff bound results in

Pr[{circumflex over (p)}>G(y−0.25)]≤e ^(−mD(G(y-0.25)∥G(y))).  (8)

The analysis splits into two cases, where the relative entropy is lower bounded using either the inequality in Theorem 3.2 or Corollary 3.3. Let

$A = \frac{\pi y}{2N}$ and $B = {\frac{\pi\left( {y - 0.25} \right)}{2N}.}$

First consider the case y>N/2 or equivalently A>π/4,

$\begin{matrix} {{{{mD}\left( {{G\left( {y - 0.25} \right)}{}{G(y)}} \right)} \geq {m\frac{\left( {{\cos^{2}(A)} - {\cos^{2}(B)}} \right)^{2}}{2{\cos^{2}(B)}}}} = {{m\frac{\left( {{\cos^{2}\left( {2A} \right)} - {\cos^{2}\left( {2B} \right)}} \right)^{2}}{8{\cos^{2}(B)}}} = {{m\frac{{\sin^{2}\left( {A + B} \right)}{\sin^{2}\left( {A - B} \right)}}{2{\cos^{2}(B)}}} = {{m\frac{\sin^{2}\left( {A - B} \right)}{2}\left( {{\sin(A)} + {{\cos(A)}{\tan(B)}}} \right)^{2}} \geq {50{cN}^{2}{\sin^{2}\left( \frac{\pi}{8N} \right)}} \geq {50{cN}^{2}\frac{\pi^{2}}{128N^{2}}} > {2{c.}}}}}} & (9) \end{matrix}$

Note that for the last two steps in the computation, the inequality (sin(A)+cos(A)tan(B))²>1 for A>π/4 and

${A - B} = \frac{\pi}{8N}$

is used along with the lower bound sin(x)≥x/2 for x∈[0, π/2].

For the case y<N/2 a similar computation is performed using Corollary 3.3 to lower bound the relative entropy,

$\begin{matrix} {{{{mD}\left( {{G\left( {y - 0.25} \right)}{}{G(y)}} \right)} \geq {m\frac{\left( {{\cos^{2}(A)} - {\cos^{2}(B)}} \right)^{2}}{2{\cos^{2}(B)}}}} = {{m\frac{{\sin^{2}\left( {A + B} \right)}{\sin^{2}\left( {A - B} \right)}}{2{\cos^{2}(B)}}} = {{m\frac{\sin^{2}\left( {A - B} \right)}{2}\left( {{\cos(B)} + {{\sin(B)}{\cot(A)}}} \right)^{2}} \geq {50{cN}^{2}{\sin^{2}\left( \frac{\pi}{8N} \right)}} > {2{c.}}}}} & (10) \end{matrix}$

For the last steps in the computation, the inequality (cos(B)+sin(B)cot(A))²>1 for A<π/4 and

${A - B} = \frac{\pi}{8N}$

was used along with the lower bound sin(x)>x/2 for x∈[0,π/2].

Similarly, F({circumflex over (p)})−F(p)≥0.25 is equivalent to G(y+0.25)>{circumflex over (p)} and the probability can be bounded using the additive Chernoff bound,

Pr[{circumflex over (p)}<G(y+0.25)]≤e ^(−mD(G(y+)0.25)∥G(y)).  (11)

Further, mD(G(y+0.25)∥G(y))≥2c for all y∈[0, N] with probability at least 1−e^(−2c). This follows from calculations similar to the ones in equations 9 and 10 with denominators (cos²(B), sin²(A)) replaced by (cos²(A), sin²(B)). From the union bound in probability theory, it follows that |{circumflex over (l)}−(−1)^(t)l mod N|≤0.25 with probability at least 1−2e ^(−2c). End of proof for Lemma 3.6.

The procedure in steps 14-16 of the QoPrime algorithm 3.1 is now described for estimating the values M_(i) that are later used in the Chinese Remainder theorem. Define the confidence intervals A_(i)=[{M_(i) }−0.25, {M_(i) }+0.25] corresponding to the estimates {M_(i) } produced by the QoPrime algorithm. Let I=∩_(i)A_(i) be the intersection of the A_(i). Applying the union bound and Lemma 3.6 it follows that I is non empty and the fractional part {M}∈I with probability at least 1−2ke^(−c). Step 14 of the QoPrime algorithm is therefore able to determine α∈I. In Step 15, from α, β_(i)∈[−0.25,0.25] is determined such that {M_(i) +β_(i)}=α and then the value M_(i) is computed as M_(i)=└M_(i) +β_(i)┘. Next, we show that using the Chinese Remainder Theorem on these values produces an estimate with error E with high probability.

Theorem 3.7: The estimate output by the QoPrime algorithm is within additive error ϵ of the true value with probability at least 1−11ke^(−2c).

Proof: For m_(i)∈

/

_(N) _(i) , let CRT(m₁, . . . , m_(k)) denote the unique integer m mod N such that m=m_(i) mod N_(i) (note that CRT stands for Chinese Remainder Theorem). The Chinese remainder theorem shows that this function is invertible, specifically CRT⁻¹(m)=(m mod n₁, . . . , m mod n_(k)). The CRT function is continuous in the following sense CRT⁻¹(m+a)=(m+a mod n₁, . . . , m+a mod n_(k)), or equivalently CRT (m₁+a, m₂+a, . . . , m_(k)+a)=CRT(m₁, . . . , m_(k))+a for a∈

. For M=└M┘+{M}, we have

M=CRT(└M┘ mod N ₁ , . . . ,└M┘ mod N _([k/q]))+{M}.  (12)

The QoPrime algorithm instead outputs the reconstructed estimate,

M=CRT(└ M ₁ +β₁┘, . . . ,└ M _([k/q]) +β_(i)┘)+α.  (13)

Assuming that the t were determined correctly, by Lemma 3.6 and the choice of β_(i) in step 15 of the QoPrime Algorithm, |M_(i) +β_(i)−(└M┘ mod N_(i)+{M})|=γ<0.5 for all i where γ is independent of i with probability at least 1−2ke^(−2c). This statement suffices for the analysis in Lemma 3.5, which in turn shows that the t are indeed determined correctly with probability 1−9e^(−2c). The probability for the moduli and estimates to be correct is 1−11ke^(−2c) by the union bound.

The accuracy of the estimates for the moduli implies that |└M_(i) +β_(i)┘−(└M┘ mod N_(i))|≤1. By continuity of the Chinese remainder theorem, the reconstruction error |M−M|≤1+|α−{M}|≤1.5. The QoPrime algorithm therefore outputs an estimate

$\frac{\pi\overset{\_}{M}}{2N}$

that differs from the true value

$\frac{\pi M}{2N}$

by at most

$\frac{\pi}{N} \leq {\epsilon.}$

End of proof for Theorem 3.7.

3.3 Choosing the Parameters

In this section, we further detail determining the parameters for the QoPrime algorithm. The small-ϵ asymptotics of the QoPrime algorithm are based on selecting coprime moduli of similar magnitude and product of order Θ(ϵ⁻¹). To this effect, we formulate the following lemma:

Lemma 3.8: Given a fixed integer k≥2 and N∈

, we can find k mutually coprime integers n₁(N)<n₂(N)< . . . <n_(k)(N) such that

${\lim_{N->\infty}\frac{{n_{1}(N)}\mspace{14mu}\ldots\mspace{14mu}{n_{k}(N)}}{N}} = 1$ and ${\lim_{N->\infty}\frac{n_{k}(N)}{n_{1}(N)}} = 1.$

This lemma follows from the sub-linear scaling of the number of coprimes that can fit inside an interval of a given size. In practice, a table of k adjacent odd coprimes may be pre-computed. The computed k adjacent odd coprimes start at each odd integer, stop at large enough values of k and n₁ according to the target precision ϵ. It suffices to build the table for k≤12 and n₁≈10⁵ to achieve approximation error ϵ=10⁻¹⁰, both with and without noise. Given the table, target precision ϵ, and an integer k≥2, the implementation chooses k adjacent coprimes from the table starting at └ϵ^(1/k)┘ with product closest to

$\frac{\pi}{\epsilon}$

in absolute value. FIG. 2 compares the table used in practice to the theoretical guarantees in Lemma 3.8.

Specifically, FIG. 2 illustrates a plot of the convergence of the coprime finding routine described above. The output of the routine is k adjacent coprimes n₁, n₂, . . . , n_(k) such that their product N=n₁ . . . n_(k) is close to π/ϵ. Both the smallest coprime n₁ (approaching 1 from below) and the largest coprime n_(k) (approaching 1 from above) are shown to converge to (π/ϵ)^(1/k) matching the convergence in Lemma 3.8 for several values of k.

We can therefore summarize the asymptotics of the noiseless QoPrime algorithm for parameters ϵ, k, q as follows:

Coprimes n₁, . . . , n_(k) =Θ(1/ϵ^(1/k)) Max Sampling Depth $\max_{i}\frac{N}{N_{i}}$ =Θ(1/ϵ^(1−q/k)) Total Oracle Calls $\Theta\left( {\sum\limits_{i = 1}^{\lceil{k/q}\rceil}{N_{i}^{2} \times \frac{N}{N_{i}}}} \right)$ ${\Theta\left( {\left\lceil \frac{k}{q} \right\rceil{1/\epsilon^{1 + {q/k}}}} \right)}(14)$

The QoPrime algorithm analysis from above indicates that, for a given target precision ϵ and overall failure probability δ, the total number of oracle calls for some embodiments of the Qoprime algorithm may scale as:

$\begin{matrix} {{\mathcal{N}\left( {k,q,\epsilon,\delta} \right)} = {C \times \left\lceil \frac{k}{q} \right\rceil \times \frac{1}{\epsilon^{1 + {q/k}}} \times {\log\left( {\frac{4}{\delta}\left\lceil \frac{k}{q} \right\rceil} \right)}}} & (15) \end{matrix}$

The constant C does not depend on the parameters of the algorithm. In fact, experimental results indicate that C is a small constant and, in practice, we can expect C<10. (e.g., see FIG. 4).

In practice, given target precision E and accepted failure probability δ, improved (e.g., optimal) values of k and q are determined such that the number of oracle calls in equation 15 is decreased (e.g., minimized). For example, in this noiseless scenario, the preferred (e.g., optimal) parameter q is 1 (indicating that it is preferred to access the largest allowed depth). Reducing (e.g., minimizing) the oracle call number in equation 15 by choosing k leads to (ignoring the subleading contribution from the failure probability δ):

$\begin{matrix} {{k^{*}(\epsilon)} \approx {\log\frac{1}{\epsilon}}} & (16) \end{matrix}$

Plugging this back into equation 15 leads to an asymptotic dependency of oracle calls of 1/ϵ, up to logarithmic factors, as expected in the quantum regime:

$\begin{matrix} {{\lim\limits_{\epsilon->0}{\mathcal{N}\left( {\epsilon,\delta} \right)}} = {C \times e \times \frac{1}{\epsilon}\log\frac{1}{\epsilon}{\log\left( \frac{4\log\frac{1}{\epsilon}}{\delta} \right)}}} & (17) \end{matrix}$

3.4 QoPrime Algorithm with Noise

In this section, we study the performance of the QoPrime AE algorithm under the same depolarizing noise model as we studied for the power law AE algorithm. Inverting the noisy probabilistic model described in equation 2, θ can be computed as,

2nθ=±arccos [e ^(γn)(1−2p)] mod 2π  (18)

As before, we can use this relation to translate a confidence interval on the probability p (say, ϵ_(p)), computed by (e.g., classical) postprocessing of measurement samples, to a confidence interval on the angle θ, denoted ϵ_(θ):

$\begin{matrix} {\epsilon_{\theta} \leq {\frac{e^{\gamma n}}{n} \times \sup\frac{d\arccos x}{dx} \times \epsilon_{p}}} & (19) \end{matrix}$

A difference in the noisy case is the exponential stretch factor of e^(γn) enhancing the angle confidence interval. Since a classical confidence interval shrinks as the square root of the number of samples, the number of samples will pick up a factor of e^(2γn) under this noise model (e.g., in order to guarantee the noiseless confidence intervals). This provides the proof of the noisy version of Lemma 3.6:

Lemma 3.9: If l∈[N_(i)/6, 5N_(i)/6] and γ≥0 is the depolarizing noise rate per oracle call, then given

$m = {100cN_{i}^{2}e^{2\gamma\frac{N - N_{i}}{2N_{i}}}}$

samples, steps 4-5 of the QoPrime AE algorithm determine an estimate such that |{circumflex over (l)}−(−1)^(t)l mod N_(i)|≤0.25 with probability at least 1−2e^(−c).

Similar to the noiseless algorithm (see (14)), we summarize the asymptotics of the noisy algorithm for parameters ϵ, k, q, γ as follows:

Coprimes n₁, . . . , n_(k) =Θ(1/ϵ^(1/k)) Sampling Depth $\max_{i}\frac{N}{N_{i}}$ =Θ(1/ϵ^(1−q/k)) Oracle Calls $\Theta\left( {\sum\limits_{i = 1}^{\lceil{k/q}\rceil}{N_{i}^{2}e^{2\gamma\;{N/N_{i}}}\frac{N}{N_{i}}}} \right)$ $= {{\Theta\left( {\left\lceil \frac{k}{q} \right\rceil\frac{1}{\epsilon^{1 + {q/k}}}e^{2{\gamma{(\frac{\pi}{\epsilon})}}^{1 - {q/k}}}} \right)}(20)}$

Therefore, for a given target precision ϵ, depolarizing noise level γ, and overall failure probability δ, the total number of oracle calls may scale as:

$\begin{matrix} {{\mathcal{N}\left( {k,q,\gamma,\epsilon,\delta} \right)} = {C \times \left\lceil \frac{k}{q} \right\rceil \times \frac{1}{\epsilon^{1 + {q/k}}} \times {\exp\left( {2{\gamma\left( \frac{\pi}{\epsilon} \right)}^{1 - {q/k}}} \right)} \times {\log\left( {\frac{4}{\delta}\left\lceil \frac{k}{q} \right\rceil} \right)}}} & (21) \end{matrix}$

where C is the same constant overhead as in equation 15. Similar to the power law AE algorithm, in practice, given target precision ϵ, noise level γ, and accepted failure probability δ, we can determine improved (e.g., optimal) values of k and q such that the number of oracle calls in equation 21 is reduced (e.g., minimized). See also FIG. 3 for the behavior of the number of oracle calls for different values of k and q.

More specifically, FIG. 3 illustrates a plot of the behavior of the oracle call dependency in equation 21 for several values of parameters k and q, and for fixed noise level γ=10⁻⁵ and probability of failure δ=10⁻⁵. When taking the minimum over the family of curves parametrized by all valid k and q (here assumed continuous for simplicity), we obtain the envelope of the improved (e.g., optimal) QoPrime algorithm (labeled “Optimized Over k, q”)). This emergent behavior smoothly interpolates between a classical 1/ϵ² scaling in the noise-dominated region ϵ«γ and a quantum scaling 1/ϵ in the coherent region ϵ«γ.

Determining (e.g., optimizing) over k and q in this manner may result in the effective scaling of oracle calls as a function of E being (e.g., always) between the classical scaling of 1/ϵ² and the quantum scaling 1/ϵ (see FIG. 4). Specifically, this can be formulated as a bound on the instantaneous exponent:

$\begin{matrix} {{- 2} \leq {\lim\limits_{\epsilon\rightarrow 0}\frac{d\inf_{k,q}\log\;{\mathcal{N}\left( {\epsilon,\gamma,\delta,k,q} \right)}}{d\;\log\;\epsilon}} \leq {- 1}} & (22) \end{matrix}$

FIG. 4 illustrates a plot of the transition from coherent to noise dominated as measured by the instantaneous exponent

$\frac{d\;\log\;\mathcal{N}}{d\;\log\;\epsilon},$

where

is the number oracle calls optimized over parameters k and q.

In the noise-dominated limit, the improved (e.g., optimal) parameter q tends to its upper bound q=k−1 (corresponding to the shallowest accessible circuits). Using this observation, we can analytically study the optimization over k using the dependency in equation 21 and obtain that the improved (e.g., optimal) k parameter may have the form (in a continuous approximation):

$\begin{matrix} {{k^{*}\left( {\epsilon,\gamma} \right)} \approx \frac{\log\frac{\pi}{\epsilon}}{\log\frac{\log\frac{1}{\epsilon}}{2\gamma\log\frac{\pi}{\epsilon}}}} & (23) \end{matrix}$

This allows us to study the asymptotic dependency of oracle calls on the target precision ϵ analytically. Extracting the low-ϵ limit yields:

$\begin{matrix} {{{\lim\limits_{{\epsilon\rightarrow 0},{\gamma ⪢ \epsilon}}{\mathcal{N}\left( {\epsilon,\gamma,\delta} \right)}} = {2C \times 2\gamma\; e \times {\log\left( \frac{8}{\delta} \right)} \times \frac{1}{\epsilon^{2}}}},} & (24) \end{matrix}$

where C is the constant prefactor introduced in equation 21. This classical-limit curve can be used to compare the asymptotic runtime of the Qoprime algorithm to classical Monte Carlo techniques.

Outside of the two noise limits, specifically noise-dominated equation 24 and noiseless equation 17, the improved (e.g., optimal) parameters k and q may depend non-trivially on the problem, and they may be determined numerically. Example (k, q) improved (e.g., optimal) trajectories obtained by optimizing equation 21 are shown in FIG. 5.

FIG. 5 illustrates a plot of the trajectory of optimal estimates for k, q parameters chosen by reducing (e.g., minimizing) the asymptotic dependency in equation 21, for two different noise levels (γ=10⁻⁴ and γ=10⁻⁸). The optimization estimate is over continuous k, q for simplicity. The green region marks the valid parameter region k≥2, 1≤q≤k−1. The arrows show the direction of the parameters as the target precision ϵ is being lowered from ϵ=10⁻³ to ϵ=10⁻¹⁰. While for ϵ«γ (noiseless regime), we have that q=1.

4. Empirical Results

In this section, we present empirical results for the power law and the QoPrime AE algorithms and compare them with other amplitude estimation algorithms. The experimental results validate the theoretical analysis and provide further insight in the behavior of these low depth algorithms in noisy regimes.

4.1 The Power Law AE

FIG. 6 compares the theoretical and empirically observed scaling for the number of oracle calls N as a function of the error rate ϵ in the power law AE algorithm in the absence of noise, i.e., γ=0. We numerically simulated the power law AE algorithm for randomly chosen θ∈[0, π/2] and with

$m_{k} = \left\lfloor k^{\frac{1 - \beta}{2\beta}} \right\rfloor$

for fixed values of parameter β∈{0.455, 0.714}, which make the exponent be {0.2,0.6} respectively. We also provide the extremal cases of β∈{0,1}. The experimental results agree closely with the predictions of the theoretical analysis.

More specifically FIG. 6 illustrates a plot of the performance of the power law AE algorithm in theory (solid lines) and practice (dots and triangles) using the schedule

$m_{k} = \left\lfloor k^{\frac{1 - \beta}{2\beta}} \right\rfloor$

for values β=0.455 and β=0.714, for different error rates ϵ. The true value is θ* is chosen at random. We took the number of shots N_(shot)=100. Applying linear regression to these experimental data points, gives slopes −1.718 and −1.469, whereas the theoretical slopes are −1.714 and −1.455 for the circle and triangle points respectively.

FIG. 7 shows the scaling of the power law AE algorithm under several noise levels. Here, the parameter β is chosen adaptively, based on the error rate ϵ and noise level γ. For target errors below the noise level, we can use the exponential schedule to get the improved (e.g., optimal) quantum scaling. After this threshold, we use the power law schedules with exponents chosen as in Proposition 2.5. The result is that for these smaller target errors, the scaling is in between the improved (e.g., optimal) quantum and the classical scaling.

More specifically, FIG. 7 illustrates a plot of the performance of the power law AE algorithm in theory (solid lines) and practice (dots, triangles, and squares) using power law schedules where the parameter β is improved (e.g., optimized) for given ϵ and γ. Also, the classical and quantum scalings are plotted for comparison. For small target errors, we obtain the improved (e.g., optimal) quantum scaling, while for smaller target errors we use power law exponents using Proposition 2.5. The result is that the scaling approaches the classical scaling as the target error goes to 0.

4.2 The QoPrime Algorithm

The parameters k and q for the QoPrime algorithm may be determined improving (e.g., optimizing) over the Chernoff upper-bounds obtained in Lemma 3.9 and described in equation 20. FIG. 8 shows the theoretical upper bounds (solid lines) and the empirically observed number of oracle calls (dots, triangles, and squares) as a function of the accuracy E for different noise rates (γ). The algorithm in practice (dots, triangles, and squares) performs better than the theoretical bounds (lines) because it computes the confidence intervals using exact binomial distributions as opposed to the Chernoff bounds in the theoretical analysis.

More specifically, FIG. 8 illustrates a plot of the performance of the QoPrime algorithm under several noise levels (γ). Shown are both theoretical upper bounds (solid lines) and the simulated oracle calls (dots, triangles, and squares) for three scenarios: noiseless (circles), depolarizing rate γ=10⁻⁷ (squares), and depolarizing rate γ=10⁻⁵ (triangles). The classical Monte Carlo curve (dashed line) is obtained by assuming noiseless classical sampling from a constant oracle depth of 1. We see the curve follows a quantum ϵ⁻¹ scaling for small errors ϵ«γ, which transitions into a classical ϵ⁻² dependency when the precision is much smaller than the noise level (ϵ<<γ).

FIG. 9 plots the maximum oracle depth as a function of the target precision ϵ for the QoPrime algorithm in noiseless (γ=0) and noisy settings (γ≠0), as well as for the IQAE algorithm.

More specifically, FIG. 9 illustrates a plot of the maximum oracle depth as a function of the target precision ϵ. The noiseless QoPrime algorithm and the IQAE algorithm in both have a similar scaling of depth as O(ϵ⁻¹). However, introducing a depolarizing noise level γ provides a bound for the depth used by the QoPrime algorithm. Specifically, the plot suggests that some embodiments QoPrime algorithm generally cannot access circuit depths higher than the noise scale γ, which corresponds to exponentially suppressed confidence intervals, and the algorithm may require exponentially more samples to produce a good estimate.

FIG. 10 provides empirical estimates for the constant factor C for the QoPrime algorithm in noisy settings. The observed value of C is a small constant and the simulations show that C<10 over a wide range of E and noise rates that cover most settings of interest. More specifically, FIG. 10 illustrates a plot of empirical values of the constant prefactor C, as described in Equation (21) above, on the target precision ϵ for an arbitrary value of the true angle θ and failure probability δ=10⁻⁵.

4.3 Benchmarking

This section compares the performance of the Power law and QoPrime AE algorithms against the amplitude estimation algorithm referred to as IQAE (Iterative Quantum Amplitude Estimation algorithm) and described in D. Grinko, J. Gacon, C. Zoufal, and S. Woerner, “Iterative quantum amplitude estimation,” arXiv preprint arXiv:1912.05559, 2019.

FIG. 11 plots the performance of the Power Law, the QoPrime, and the IQAE for the noise level of γ=10⁻³. The performance is measured by the total number of oracle calls for target accuracy ϵ. The plot emphasizes the advantage of the Power law and the QoPrime over algorithms such as IQAE (IQAE requires access to a circuit of depth of O(1/ϵ). In this scenario, the large depth of the IQAE is exponentially penalized by the depolarizing noise. Thus, IQAE requires an exponentially large number of classical samples to achieve a precision below the noise level. In comparison, the power law and the QoPrime AE algorithms transition smoothly to a classical estimation scaling (said differently, they become generally parallel with the classical scaling) and do not suffer from an exponential growth in oracle calls. In these simulation results, the Power law AE algorithm has the best practical performance (because it uses less oracle calls than QoPrime).

5. Regularity Conditions for Bernstein Von-Mises Theorem

We enumerate the regularity conditions for the Bernstein Von Mises theorem (Provided in section 2.1) for power law schedules. Let us consider a schedule where k oracle calls in series are made N_(k) times, where measurements (e.g., in the standard basis) and Bayesian updates are performed after each of the N_(k) shots. The variable N_(k) ₀ and N_(k) ₁ represent the number of times outcomes 0 and 1 are observed in the N_(k) measurements. The probability density function and the log likelihood may be given by,

$\begin{matrix} {\mspace{79mu}{{f\left( {X,\theta} \right)} = {\frac{1}{Z}{\prod_{k}{{\cos^{2}\left( {\left( {{2k} + 1} \right)\theta} \right)}^{N_{k_{0}}}{\sin^{2}\left( {\left( {{2k} + 1} \right)\theta} \right)}^{N_{k_{1}}}}}}}} & (25) \\ {{l\left( {X,\theta} \right)} = {{\sum_{k}{2N_{k_{0}}\log{\cos\left( {\left( {{2k} + 1} \right)\theta} \right)}}} + {2N_{k_{1}}\log{\sin\left( {\left( {{2k} + 1} \right)\theta} \right)}} + {\log\; Z}}} & (26) \end{matrix}$

The regularity conditions for the Bernstein Von-Mises Theorem state that there is a suitable domain Θ such that the following statements hold for all possible values of the measurement outcomes X and for some integer s≥2:

Statement 1: ƒ(X, θ) is continuous on Θ.

Statement 2: l(X, θ) is continuous on Θ.

Statement 3: For every θ∈Θ there is an open neighborhood U_(θ) such that for σ, τ∈U we have E_(σ)(l(X,τ)^(s)) is bounded. (More precisely, the supremum of E_(σ)(l(X,τ)^(s)) is finite).

Statement 4: For every (0, τ)∈(Θ,Θ) there exist neighborhoods U and V of θ,τ (these neighborhoods depend on θ and τ) such that the supremum E_(σ)|inf_(δ∈V)l(X, δ)|^(s) over all σ∈U is finite.

Statement 5: l(X, θ) is twice differentiable on Θ.

Statement 6: For all θ∈Θ there is a neighborhood U_(θ) such that for all τ∈U_(θ),

0<E _(τ)[l″(X,τ)^(s)]<∞  (27)

This is the s-th moment of the Fisher information.

Statement 7: There are neighborhoods U for all θ and a bounded function k_(θ): X→

such that,

∥{l″(X,τ)−l″(X,σ)∥≤∥τ−σ∥k _(θ)(X)  (28)

for all τ,σ∈U.

Statement 8: The prior probability λ is positive on Θ and 0 on R\Θ.

Statement 9: For every θ∈Θ there is a neighborhood U_(Θ) such that for all σ,τ∈U_(θ) and constant c_(θ)>0 such that,

|log λ(σ)−log λ(τ)|≤∥σ−τ∥c _(θ)  (29)

This is equivalent to the continuity of λ′ on Θ.

These regularity conditions above can be sub-divided into three groups as follows:

Statements 1-4 are about the smoothness of ƒ(X, θ) and l(X, θ), they may be satisfied if the norm of log-likelihood is bounded on Θ. We can choose Θ to be a subinterval around the true value for which the log-likelihood is bounded.

Statements 5-7 are about the smoothness of the Fisher information on Θ. They assert that the Fisher information is bounded on Θ and is differentiable, this means that the log-likelihood function should have derivatives of order at least 3.

Statements 8-9 are about the smoothness of the prior distribution, namely that the first derivative of the prior should be a continuous function. These are trivially true for the uniform distribution.

Referring back to FIG. 1, the graph illustrates that the log-likelihood function is smooth over a neighborhood of the true value indicating that the regularity conditions for the Bernstein-Von Mises theorem are plausible in this setting, for a large neighborhood Θ to be around the true value. Algorithm 2.1 is stated with Θ as the entire [0,π/2] interval as this choice seems to work in practice, one can also imagine a slightly modified algorithm where the first few sampling rounds are used to get a rough estimate for the true value lying in a large interval Θ and for subsequent rounds the prior is uniform on Θ, with convergence established using the Bernstein Von-Mises theorem. Adding noise may further regularize the log-likelihood functions and enforce the regularity conditions of the Bernstein-Von Mises theorem.

6. Example Methods

6.1 Example Method of Power Law Algorithm

FIG. 12 is a flow chart illustrating a first method of operating a quantum computing system to determine θ of a unitary operator U within an error ϵ, where U|0^(t)

=cos(θ)|x, 0

+sin(θ)|x′, 1

, according to one embodiment. The method may be performed with success probability at least p. The steps of method may be performed in different orders, and the method may include different, additional, or fewer steps. The method is described as being performed by a computing system that, for example, includes a classical computing system and quantum computing system. The computing system may perform the operations of the method by executing instructions stored on a non-transitory computer readable storage medium. The method in FIG. 12 may be considered an embodiment of the Power Law algorithm. For example, the steps may be similar to the steps in Algorithm 2.1.

The computing system (e.g., the classical computing system) determines 1205 a parameter β∈(0,1) (exclusive of 0 and 1). The parameter β determines how quickly the circuit depth grows (circuit depth is given by m_(k)) for growing values of k (further described below). Viewed differently, parameter β determines the tradeoff between circuit depth and speedup. β may be determined based on the noise of the computing system (e.g., the quantum computing system). In some embodiments, the parameter β may be determined using the techniques described in Section 4.

The computing system (e.g., the classical computing system) determines 1210 an initial probability distribution p(θ) for a set of angles θ. The initial distribution may be a prior probability distribution (referred to as a prior). A prior is a probability distribution that reflects an initial estimate or guess before additional information is accounted for. In some embodiments, the distribution is a (e.g., uniform) distribution for angles described by

${\theta = \frac{\pi\; t\;\epsilon}{2}}.$

One may think of this as discretizing the possible angles (e.g., [0,π/2]) into a number of bins (e.g., 1/ϵ) of size τϵ/2. The variable t may then be an integer greater than zero that numbers those bins (e.g., t∈[0,1/ϵ]). Note that t is not related to the superscript t in |0^(t)).

For k=1 to K, steps 1215-1235 are performed in a loop. Generally, K is selected to be as big as possible given the circuit depth capabilities of the quantum computing system. In some embodiments, K is defined as

${K = {\max\left( {\frac{1}{\epsilon^{2\beta}},{\log\;\left( {1/\epsilon} \right)}} \right)}},$

which means that the value of K is the larger of

$\frac{1}{\epsilon^{2\beta}}$

and log(1/ϵ). In these embodiments, this results in the method following the exponential schedule as β→0.

The computing system (e.g., the classical computing system) initializes 1215 N_(k0)=0 and N_(k1)=0.

For i=1 to N_(shots), steps 1220-1230 are performed in a loop. N_(shots) is an integer that specifies the number of shots taken at each value of k. Said differently, N_(shots) specifies the number of samples taken at a circuit depth specified by m_(k). N_(shots) may be determined during step 1205. If it is desirable to reduce (e.g., minimize) the total number of oracle calls, it may be desirable to reduce N_(shots) (since N_(shots) affects the total number of oracle calls).

The quantum computing system sequentially executes 1220 the unitary operator

$\left\lfloor k^{\frac{1 - \beta}{2\beta}} \right\rfloor$

times. Said differently, the quantum computing system executes a quantum circuit that includes the unitary operator

$\left\lfloor k^{\frac{1 - \beta}{2\beta}} \right\rfloor$

times in a sequence. This creates a quantum state on the quantum computing system.

The quantum computing system measures 1225 a qubit of the resulting quantum state (e.g., in a standard basis). The qubit may be the last qubit (as previously described). More generally, any marked state may be measured (e.g., corresponding to more than one qubit). For example, the quantum computing system measures whether two qubits are both 0.

The computing system (e.g., the classical computing system) updates 1230 the value of N_(k0) or N_(k1) based on the measurement. N_(k0) indicates the number of times a value of 0 is measured and N_(k1) indicates the number of times a value of 1 is measured. For example, responsive to a value of the qubit being 0, the value of N_(k0) is updated to be N_(k0)+1, and responsive to the value of the qubit being 1, the value of N_(k1) is updated to be N_(k1)+1.

The computing system (e.g., the classical computing system) updates 1235 the probability distribution based on the updated values of N_(k0) or N_(k1). For example, the update is a Bayesian update. The Bayesian update may include p(θ)→p(θ)cos((2m_(k)+1)θ)^(N) ^(k0) sin((2m_(k)+1)θ)^(N) ^(k1) , for θ=πtϵ/2 for integer t∈[0,1/ϵ]. The updated probability distribution may be referred to as a posterior probability distribution.

The computing system (e.g., the classical computing system) determines 1240 θ based on the updated probability distribution. In some embodiments, the determined value of θ corresponds to the highest probability of the posterior probability distribution. Thus, θ may be determined by determining what value of θ corresponds to the highest probability (e.g., within a threshold error) of the posterior probability distribution. In some embodiments, θ is determined within error E with probability at least 0.9.

In some embodiments, the total number of times U is executed on the quantum computing system to determine θ scales as O(1/ϵ^(1+β)).

In some embodiments, the number of times the unitary operator U is sequentially executed in a single run scales as

${O\left( \frac{1}{\epsilon^{1 - \beta}} \right)}.$

In some embodiments, the number is not more than

${O\left( \frac{1}{\epsilon^{1 - \beta}} \right)}.$

6.2 Example Method for QoPrime Law Algorithm

FIG. 13 is a flow chart illustrating a second method of operating a quantum computing system to determine θ of a unitary operator U within an error ϵ, where U|0^(t)

=cos(θ)|x, 0

+sin(θ)|x′, 1

, according to one embodiment. The method may be performed with success probability at least p, where 1−2ke^(−2c)>p for constant c. The steps of method may be performed in different orders, and the method may include different, additional, or fewer steps. The method is described as being performed by a computing system that, for example, includes a classical computing system and quantum computing system. The computing system may perform the operations of the method by executing instructions stored on a non-transitory computer readable storage medium. The method in FIG. 13 may be considered an embodiment of the QoPrime algorithm. For example, the steps may be similar to the steps in Algorithm 3.1.

The computing system (e.g., the classical computing system) determines 1305 integer parameters k and q, where k≥2 and 1≤q≤(k−1). k specifies the number of moduli determined in step 1310. q specifies the number of moduli grouped together in step 1315. The parameters k and q may be determined using the techniques described in Section 4.

The computing system (e.g., the classical computing system) determines 1310 a set of k co-prime moduli (n₁, n₂, . . . , n_(k)). N=Π_(i∈[k])n_(i) describes the product of the co-prime moduli in the set. The co-prime moduli in the set may be determined so that N≥π/ϵ. Each modulus in the set is an integer. Additionally, the co-prime moduli may be odd. Alternatively, one of them may be even. The set of k co-prime moduli may be selected to be close to └1/ϵ^(1/k)┘. For example, the set includes k integers that have the smallest average difference from 1/ϵ^(1/k). In some embodiments, the set of k co-prime moduli are determined using techniques described in Section 3.3.

The computing system (e.g., the classical computing system) partitions 1315 the set of k co-prime moduli (n₁, n₂, . . . , n_(k)) into [k/q] groups π_(i) of size at most q.

At step 1320 (labeled “If Condition”), the computing system (e.g., the classical computing system) may perform the following operations. Responsive to q/k≥⅓, the computing system uses a number (e.g., greater than 0 and at most

$\left. \frac{24c}{\epsilon^{1 + {q/k}}} \right)$

of depth 0 samples to obtain an additive error (e.g.,

$\left. \frac{1}{2\epsilon^{1 - {q/k}}} \right)$

estimate for θ. A depth 0 sample refers to performing a single oracle call (executing a circuit with a single instance of the unitary operator U). This may be done at a threshold probability (e.g., at least 1−e^(−2c), where c is a constant). For more information, see step 4 of algorithm 3.1.

Responsive to q/k<⅓, the computing system may perform the method of FIG. 13 recursively to obtain an additive error estimate for θ. For example, the method is invoked recursively with accuracy ϵ′=O(1/ϵ^(1−q/k)) and q′/k′ such that

${1 + {q/k}} < {1 + {q^{\prime}/k^{\prime}}} < {\frac{\left( {1 + {q/k}} \right)}{\left( {1 - {q/k}} \right)}.}$

For more information, see step 6 of algorithm 3.1.

For i=1 to [k/q], steps 1325-1340 are performed in a loop.

For α=1 to A, steps 1325-1335 are performed in a loop. In some embodiments, the number of iterations A is 100 cN_(i) ², where N_(i)=Π_(j∈π) _(i) n_(j). As demonstrated in Lemma 3.6, this number of samples is sufficient to get to a good estimate with probability 1−2e^(−2c). However, the number of interactions may be more or less.

The quantum computing system executes 1325 the unitary operator U to generate a quantum state |ϕ_((N-N) _(i) _()/2N) _(i)

defined by |ϕ_(μ)

=cos((2μ+1)θ)|x, 0

+sin((2μ+1)θ)|x′, 1

. This may be implemented by the quantum computing system running a program that includes the unitary operator U as a subroutine. For example, |ϕ_((N-N) _(i) _()/2N) _(i)

is prepared by sequentially executing the unitary operator U (2((N−N_(i))/2N_(i))+1) times.

The quantum computing system measures 1330 the generated quantum state |ϕ_((N-N) _(i) _()/2N) _(i)

(e.g., in the standard basis). For example, the quantum computing system measures all of the qubits involved. In another example, the quantum computing system just measures the one or more qubits corresponding to the marked state.

The computing system (e.g., the classical computing system) records 1335 the measured quantum state.

The computing system (e.g., the classical computing system) determines 1340 M_(t) based on an observed probability that the measurement of particular qubit is 0 (this qubit may be predetermined). More generally, M_(l) is determined based on an observed probability of measuring 0 for a marked state, where a single qubit is may be chosen as the marked state. In some embodiments, determining M_(l) comprises: determining

$\hat{l} = \frac{2N_{i}}{\pi}$

arccos(√{square root over ({circumflex over (p)})}), where {circumflex over (p)} is an observed probability of outcome 0, and determining M_(l) =(−1)^(t){circumflex over (l)} mod N_(i), where t determines the sign of M_(l) (note that t is not related to the superscript t in |0^(t)

. Determining M_(l) may further comprise determining M_(i)=└M_(l) +β_(i)┘, where β_(i)∈[−0.25, 0.25] such that {M_(l) +β_(i)}=α and α is a number in the interval I=∩_(i)([M_(l) −0.25, M_(l) +0.25] mod 1). Steps 10-12 of Algorithm 3.1 may provide more information.

The computing system (e.g., the classical computing system) constructs 1345 M mod N by applying the Chinese Remainder Theorem (e.g., Theorem 3.1) to values based on M_(l) .

The computing system (e.g., the classical computing system) determines 1350 θ based on M. For example, θ is determined by computing

$\frac{\pi\left( {\overset{\_}{M} + \alpha} \right)}{2N}.$

In some embodiments, the total number of times the unitary operator U is executed by the quantum computing system determine θ scales as O(1/ϵ^(1+q/k)).

In some embodiments, the number of times the unitary operator U is sequentially executed in a single run scales as O(1/ϵ^(1−q/k)). In some embodiments, the number is not more than O(1/ϵ^(1−q/k)).

7. Description of a Computing System

FIG. 14A is a block diagram that illustrates an embodiment of a computing system 1400. The computing system 1400 includes a classical computing system 1410 (also referred to as a non-quantum computing system) and a quantum computing system 1420. The classical computing system 1410 may control the quantum computing system 1420. An embodiment of the classical computing system 1410 is described further with respect to FIG. 15. While the classical computing system 1410 and quantum computing system 1420 are illustrated together, they may be physically separate systems (e.g., in a cloud architecture). In other embodiments, the computing system 1400 includes different or additional elements (e.g., multiple quantum computing systems 1420). In addition, the functions may be distributed among the elements in a different manner than described.

FIG. 14B is a block diagram that illustrates an embodiment of the quantum computing system 1420. The quantum computing system 1420 includes any number of quantum bits (“qubits”) 1450 and associated qubit controllers 1440. As illustrated in FIG. 14C, the qubits 150 may be in a qubit register of the quantum computing system 1420. Qubits are further described below. A qubit controller 1440 is a module that controls one or more qubits 1450. A qubit controller 1440 may include a classical processor such as a CPU, GPU, or FPGA. A qubit controller 1440 may perform physical operations on one or more qubits 1450 (e.g., it can perform quantum gate operations on a qubit 1440). In the example of FIG. 14B, a separate qubit controller 1440 is illustrated for each qubit 1450, however a qubit controller 1450 may control multiple (e.g., all) qubits 1450 of the quantum computing system 1420 or multiple controllers 1450 may control a single qubit. For example, the qubit controllers 1450 can be separate processors, parallel threads on the same processor, or some combination of both. In other embodiments, the quantum computing system 1420 includes different or additional elements. In addition, the functions may be distributed among the elements in a different manner than described.

FIG. 14D is a flow chart that illustrates an example execution of a quantum routine on the computing system 1400. The classical computing system 1410 generates 1460 a quantum program to be executed or processed by the quantum computing system 1420. The quantum program may include instructions or subroutines to be performed by the quantum computing system 1420. In an example, the quantum program is a quantum circuit. This program can be represented mathematically in a quantum programming language or intermediate representation such as QASM or Quil.

The quantum computing system 1420 executes 1465 the program and computes 1470 a result (referred to as a shot or run). Computing the result may include performing a measurement of a quantum state generated by the quantum computing system 1420 that resulted from executing the program. Practically, this may be performed by measuring values of one or more of the qubits 1450. The quantum computing system 1420 typically performs multiple shots to accumulate statistics from probabilistic execution. The number of shots and any changes that occur between shots (e.g., parameter changes)) may be referred to as a schedule. The schedule may be specified by the program. The result (or accumulated results) is recorded 1475 by the classical computing system 1410. Results may be returned after a termination condition is met (e.g., a threshold number of shots occur).

FIG. 15 is an example architecture of a classical computing system 1410, according to an embodiment. The quantum computing system 1420 may also have one or more components described with respect to FIG. 15. Although FIG. 15 depicts a high-level block diagram illustrating physical components of a computer system used as part or all of one or more entities described herein, in accordance with an embodiment. A computer may have additional, less, or variations of the components provided in FIG. 15. Although FIG. 15 depicts a computer 1500, the figure is intended as functional description of the various features which may be present in computer systems than as a structural schematic of the implementations described herein. In practice, and as recognized by those of ordinary skill in the art, items shown separately could be combined and some items could be separated.

Illustrated in FIG. 15 are at least one processor 1502 coupled to a chipset 1504. Also coupled to the chipset 1504 are a memory 1506, a storage device 1508, a keyboard 1510, a graphics adapter 1512, a pointing device 1514, and a network adapter 1516. A display 1518 is coupled to the graphics adapter 1512. In one embodiment, the functionality of the chipset 1504 is provided by a memory controller hub 1520 and an I/O hub 1522. In another embodiment, the memory 1506 is coupled directly to the processor 1502 instead of the chipset 1504. In some embodiments, the computer 1500 includes one or more communication buses for interconnecting these components. The one or more communication buses optionally include circuitry (sometimes called a chipset) that interconnects and controls communications between system components.

The storage device 1508 is any non-transitory computer-readable storage medium, such as a hard drive, compact disk read-only memory (CD-ROM), DVD, or a solid-state memory device or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid state storage devices. Such a storage device 1508 can also be referred to as persistent memory. The pointing device 1514 may be a mouse, track ball, or other type of pointing device, and is used in combination with the keyboard 1510 to input data into the computer 1500. The graphics adapter 1512 displays images and other information on the display 1518. The network adapter 1516 couples the computer 1500 to a local or wide area network.

The memory 1506 holds instructions and data used by the processor 1502. The memory 1506 can be non-persistent memory, examples of which include high-speed random access memory, such as DRAM, SRAM, DDR RAM, ROM, EEPROM, flash memory.

As is known in the art, a computer 1500 can have different or other components than those shown in FIG. 15. In addition, the computer 1500 can lack certain illustrated components. In one embodiment, a computer 1500 acting as a server may lack a keyboard 1510, pointing device 1514, graphics adapter 1512, or display 1518. Moreover, the storage device 1508 can be local or remote from the computer 1500 (such as embodied within a storage area network (SAN)).

As is known in the art, the computer 1500 is adapted to execute computer program modules for providing functionality described herein. As used herein, the term “module” refers to computer program logic utilized to provide the specified functionality. Thus, a module can be implemented in hardware, firmware, or software. In one embodiment, program modules are stored on the storage device 1508, loaded into the memory 1506, and executed by the processor 302.

Referring back to FIGS. 14A-14C, the quantum computing system exploits the laws of quantum mechanics in order to perform computations. A quantum processing device, quantum computer, quantum processor, and quantum processing unit are each examples of a quantum computing system. A quantum computing system can be a universal or a non-universal quantum processing device (a universal quantum device can execute any possible quantum circuit (subject to the constraint that the circuit doesn't use more qubits than the quantum device possesses)). Quantum processing devices commonly use so-called qubits, or quantum bits. While a classical bit always has a value of either 0 or 1, a qubit is a quantum mechanical system that can have a value of 0, 1, or a superposition of both values. Example physical implementations of qubits include superconducting qubits, spin qubits, trapped ions, arrays of neutral atoms, and photonic systems (e.g., photons in waveguides). For the purposes of this disclosure, a qubit may be realized by a single physical qubit or as an error-protected logical qubit that itself comprises multiple physical qubits. The disclosure is also not specific to qubits. The disclosure may be generalized to apply to quantum processors whose building blocks are qudits (d-level quantum systems, where d≥2) or quantum continuous variables, rather than qubits.

A quantum circuit is an ordered collection of one or more gates. A sub-circuit may refer to a circuit that is a part of a larger circuit. A gate represents a unitary operation performed on one or more qubits. Quantum gates may be described using unitary matrices. The depth of a quantum circuit is the least number of steps needed to execute the circuit on a quantum computer. The depth of a quantum circuit may be smaller than the total number of gates because gates acting on non-overlapping subsets of qubits may be executed in parallel. A layer of a quantum circuit may refer to a step of the circuit, during which multiple gates may be executed in parallel. In some embodiments, a quantum circuit is executed by a quantum computing system. In this sense a quantum circuit can be thought of as comprising a set of instructions or operations that a quantum computing system can execute. To execute a quantum circuit on a quantum computing system, a user may inform the quantum computing system what circuit is to be executed. A quantum computing system may include both a core quantum device and a classical peripheral/control device (e.g., a qubit controller) that is used to orchestrate the control of the quantum device. It is to this classical control device that the description of a quantum circuit may be sent when one seeks to have a quantum computer execute a circuit.

A variational quantum circuit may refer to a parameterized quantum circuit that is executed many times, where each time some of the parameter values may be varied. The parameters of a parameterized quantum circuit may refer to parameters of the gate unitary matrices. For example, a gate that performs a rotation about the y axis may be parameterized by a real number that describes the angle of the rotation. Variational quantum algorithms are a class of hybrid quantum-classical algorithm in which a classical computer is used to choose and vary the parameters of a variational quantum circuit. Typically, the classical processor updates the variational parameters based on the outcomes of measurements of previous executions of the parameterized circuit.

The description of a quantum circuit to be executed on one or more quantum computers may be stored in a non-transitory computer-readable storage medium. The term “computer-readable storage medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, or associated caches and servers) able to store instructions. The term “computer-readable medium” shall also be taken to include any medium that is capable of storing instructions for execution by the quantum computer and that cause the quantum computer to perform any one or more of the methodologies disclosed herein. The term “computer-readable medium” includes, but is not limited to, data repositories in the form of solid-state memories, optical media, and magnetic media.

The approaches described above may be amenable to a cloud quantum computing system, where quantum computing is provided as a shared service to separate users. One example is described in patent application Ser. No. 15/446,973, “Quantum Computing as a Service,” which is incorporated herein by reference.

8. Additional Considerations

The disclosure above describes example embodiments for purposes of illustration only. Any features that are described as essential, important, or otherwise implied to be required should be interpreted as only being required for that embodiment and are not necessarily included in other embodiments.

Additionally, the above disclosure often uses the phrase “we” (and other similar phases) to reference an entity that is performing an operation (e.g., a step in an algorithm). These phrases are used for convenience. These phrases may refer to a computing system (e.g., including a classical computing system and a quantum computing system) that is performing the described operations.

Some portions of above description describe the embodiments in terms of algorithmic processes or operations. These algorithmic descriptions and representations are commonly used by those skilled in the computing arts to convey the substance of their work effectively to others skilled in the art. These operations, while described functionally, computationally, or logically, are understood to be implemented by computer programs comprising instructions for execution by a processor or equivalent electrical circuits, microcode, or the like. Furthermore, it has also proven convenient at times, to refer to these arrangements of functional operations as modules, without loss of generality. In some cases, a module can be implemented in hardware, firmware, or software.

As used herein, any reference to “one embodiment” or “an embodiment” means that a particular element, feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment. Similarly, use of “a” or “an” preceding an element or component is done merely for convenience. This description should be understood to mean that one or more of the elements or components are present unless it is obvious that it is meant otherwise. As used herein, the terms “comprises,” “comprising,” “includes,” “including,” “has,” “having” or any other variation thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, article, or apparatus that comprises a list of elements is not necessarily limited to only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Further, unless expressly stated to the contrary, “or” refers to an inclusive or and not to an exclusive or. For example, a condition A or B is satisfied by any one of the following: A is true (or present) and B is false (or not present), A is false (or not present) and B is true (or present), and both A and B are true (or present).

In addition, use of the “a” or “an” are employed to describe elements and components of the embodiments. This is done merely for convenience and to give a general sense of the disclosure. This description should be read to include one or at least one and the singular also includes the plural unless it is obvious that it is meant otherwise. Where values are described as “approximate” or “substantially” (or their derivatives), such values should be construed as accurate+/−10% unless another meaning is apparent from the context. From example, “approximately ten” should be understood to mean “in a range from nine to eleven.”

Alternative embodiments are implemented in computer hardware, firmware, software, and/or combinations thereof. Implementations can be implemented in a computer program product tangibly embodied in a machine-readable storage device for execution by a programmable processor; and method steps can be performed by a programmable processor executing a program of instructions to perform functions by operating on input data and generating output. As used herein, ‘processor’ may refer to one or more processors. Embodiments can be implemented advantageously in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device. Each computer program can be implemented in a high-level procedural or object-oriented programming language, or in assembly or machine language if desired; and in any case, the language can be a compiled or interpreted language. Suitable processors include, by way of example, both general and special purpose microprocessors. Generally, a processor will receive instructions and data from a read-only memory and/or a random-access memory. Generally, a computer will include one or more mass storage devices for storing data files; such devices include magnetic disks, such as internal hard disks and removable disks; magneto-optical disks; and optical disks. Storage devices suitable for tangibly embodying computer program instructions and data include all forms of non-volatile memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM disks. Any of the foregoing can be supplemented by, or incorporated in, ASICs (application-specific integrated circuits) and other forms of hardware.

Although the above description contains many specifics, these should not be construed as limiting the scope of the invention but merely as illustrating different examples. It should be appreciated that the scope of the disclosure includes other embodiments not discussed in detail above. Various other modifications, changes, and variations which will be apparent to those skilled in the art may be made in the arrangement, operation, and details of the methods and apparatuses disclosed herein without departing from the spirit and scope of the invention. 

What is claimed is:
 1. A method of operating a quantum computer to determine θ of a unitary U operator within an error ϵ, wherein U|0^(t)

=cos(θ)|x, 0

+sin(θ)|x′, 1

, the method comprising: determining a parameter β∈(0,1) and N_(shots), where N_(shots) is an integer greater than 0; determining a probability distribution p(θ) for a set of angles θ; for k=1 to K: initializing N_(k0)=0 and N_(k1)=0; for i=1 to N_(shots): sequentially executing, by the quantum computer, the unitary operator $U\left\lfloor k^{\frac{1 - \beta}{2\beta}} \right\rfloor$ times; measuring, by the quantum computer, a qubit of the resulting quantum state; updating a value of N_(k0) or N_(k1) based on the measured value of the qubit; performing a Bayesian update to the probability distribution p(θ) based on the updated values of N_(k0) and N_(k1); and determining θ based on the updated probability distribution.
 2. The method of claim 1, wherein a total number of times U is executed on the quantum computer to determine θ scales as O(1/ϵ^(1+β)).
 3. The method of claim 1, wherein the number of times the unitary operator U is sequentially executed in a single run scales as ${O\left( \frac{1}{\epsilon^{1 - \beta}} \right)}.$
 4. The method of claim 1, wherein determining the probability distribution p(θ) for the set of angles θ comprises determining a uniform distribution for a set of angles ${\theta = \frac{\pi\; t\;\epsilon}{2}},$ where (t∈[0,1/ϵ]).
 5. The method of claim 1, wherein updating the value of N_(k0) or N_(k1) based on the measured value of the qubit comprises: responsive to the measured value of the qubit being 0, updating the value of N_(k0) to be N_(k0)+1; and responsive to the measured value of the qubit being 1, updating the value of N_(k1) to be N_(k1)+1.
 6. The method of claim 1, wherein performing the Bayesian update includes p(θ)→p(θ)cos((2m_(k)+1)θ)^(N) ^(k0) sin((2m_(k)+1)θ)^(N) ^(k1) for θ=πtϵ/2 for integer t∈[0,1/ϵ], where $m_{k} = {\left\lfloor k^{\frac{1 - \beta}{2\beta}} \right\rfloor.}$
 7. The method of claim 1, wherein the θ corresponds to a highest probability of the updated probability distribution.
 8. The method of claim 1, wherein β is determined based on noise of the computing system.
 9. The method of claim 1, wherein θ is determined within error E with probability at least 0.9.
 10. The method of claim 1, wherein $K = {{\max\left( {\frac{1}{\epsilon^{2\beta}},{\log\left( {1/\epsilon} \right)}} \right)}.}$
 11. A method of operating a quantum computer to determine θ of a unitary operator U within an error ϵ, wherein U|0^(t)

=cos(θ)|x, 0

+sin(θ)|x′, 1

, the method comprising: determining integer parameters k and q, where k≥2 and 1≤q≤(k−1); determining a set of k co-prime moduli (n₁, n₂, . . . , n_(k)), where N=Π_(i∈[k])n_(i) is equal to or greater than π/ϵ; partitioning the set of k co-prime moduli (n₁, n₂, . . . , n_(k)) into [k/q] groups π_(i) of size at most q; for i=1 to [k/q]: for a number of iterations: executing, by the quantum computer, the unitary operator U to generate a quantum state |ϕ_((N-N) _(i) _()/2N) _(i)

defined by |ϕ_(μ)

=cos((2μ+1)θ)|x, 0

+sin(2μ+1)θ)|x′, 1

, where N_(i)=Π_(j∈π) _(i) n_(j); measuring, by the quantum computer, the quantum state; and recording the measured quantum state; determining M_(l) based on an observed probability of measuring 0 for a qubit; constructing M mod N by applying a Chinese Remainder Theorem to values based on M_(l) ; and determining θ based on M.
 12. The method of claim 11, wherein a total number of times U is executed on the quantum computer to determine θ scales as O(1/ϵ^(1+q/k)).
 13. The method of claim 11, wherein the number of times the unitary operator U is sequentially executed in a single run scales as O(1/ϵ^(1−q/k)).
 14. The method of claim 11, where |ϕ_((N-N) _(i) _()/2N) _(i)

is generated by sequentially executing the unitary operator U(2((N−N_(i))/2N_(i))+1) times.
 15. The method of claim 11, wherein the number of iterations is 100 cN_(i) ².
 16. The method of claim 11, where determining M_(l) based on the observed probability of measuring 0 for the qubit comprises: determining $\hat{l} = \frac{2N_{i}}{\pi}$ arccos(√{square root over ({circumflex over (p)})}), where {circumflex over (p)} is an observed probability of outcome 0; and determining M_(l) =(−1)^(t){circumflex over (l)} mod N_(i), where t is based on an additive error estimate for θ.
 17. The method of claim 11, further comprising: determining M_(i)=└M_(l) +62 _(i)┘, where β_(i)∈[−0.25, 0.25] such that {M_(l) +β_(i)}=α and α is a number in the interval I=∩_(i)([M_(l) −0.25,M_(l) +0.25] mod 1)
 18. The method of claim 17, wherein the Chinese Remainder Theorem is applied to values of M_(i).
 19. The method of claim 11, wherein determining θ based on M comprises determining θ based on $\frac{\pi\left( {\overset{\_}{M} + \alpha} \right)}{2N},$ wherein α is a number in the interval I=∩_(i)([M_(l) −0.25,M_(l) +0.25] mod 1).
 20. The method of claim 11, wherein θ is determined within error E with probability at least p, where 1−2ke^(−2c)>p for constant c. 